2025-12-04T11:10:55.7718942Z Current runner version: '2.329.0'
2025-12-04T11:10:55.7721935Z Runner name: 'linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv'
2025-12-04T11:10:55.7722343Z Runner group name: 'default'
2025-12-04T11:10:55.7722759Z Machine name: 'linux'
2025-12-04T11:10:55.7723885Z ##[group]GITHUB_TOKEN Permissions
2025-12-04T11:10:55.7724973Z Contents: read
2025-12-04T11:10:55.7725216Z Metadata: read
2025-12-04T11:10:55.7725482Z ##[endgroup]
2025-12-04T11:10:55.7726529Z Secret source: Actions
2025-12-04T11:10:55.7726821Z Prepare workflow directory
2025-12-04T11:10:55.7969291Z Prepare all required actions
2025-12-04T11:10:55.7989337Z Getting action download info
2025-12-04T11:10:56.2251098Z Download action repository 'pytorch/pytorch@main' (SHA:c0cb6e78404416d418350632bfc554710a5f7281)
2025-12-04T11:10:59.9020029Z Download action repository 'pytorch/test-infra@main' (SHA:39aa74d619174326f4e2fb0e216151c2f29d9ffd)
2025-12-04T11:11:01.0509058Z Download action repository 'actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02)
2025-12-04T11:11:01.8977373Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-12-04T11:11:02.7275798Z Getting action download info
2025-12-04T11:11:02.9237678Z Download action repository 'actions/checkout@v4' (SHA:34e114876b0b11c390a56381ad16ebd13914f8d5)
2025-12-04T11:11:03.7297003Z Getting action download info
2025-12-04T11:11:03.9131206Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-12-04T11:11:04.6883677Z Getting action download info
2025-12-04T11:11:04.8880007Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/main (ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32)
2025-12-04T11:11:04.8882101Z ##[group] Inputs
2025-12-04T11:11:04.8882275Z build-environment: linux-noble-rocm-py3.12-mi300
2025-12-04T11:11:04.8883618Z test-matrix: {"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]}
2025-12-04T11:11:04.8885156Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a
2025-12-04T11:11:04.8885469Z sync-tag:
2025-12-04T11:11:04.8885987Z timeout-minutes: 300
2025-12-04T11:11:04.8886091Z tests-to-include:
2025-12-04T11:11:04.8886195Z dashboard-tag:
2025-12-04T11:11:04.8886425Z disable-monitor: true
2025-12-04T11:11:04.8886553Z monitor-log-interval: 5
2025-12-04T11:11:04.8886683Z monitor-data-collect-interval: 1
2025-12-04T11:11:04.8886807Z ##[endgroup]
2025-12-04T11:11:04.8887058Z Complete job name: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check)
2025-12-04T11:11:04.9373905Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main
2025-12-04T11:11:04.9374395Z with:
2025-12-04T11:11:04.9374547Z no-sudo: true
2025-12-04T11:11:04.9374975Z submodules: recursive
2025-12-04T11:11:04.9375124Z fetch-depth: 0
2025-12-04T11:11:04.9375347Z env:
2025-12-04T11:11:04.9375511Z GIT_DEFAULT_BRANCH: main
2025-12-04T11:11:04.9375716Z ##[endgroup]
2025-12-04T11:11:04.9463346Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T11:11:04.9463937Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T11:11:04.9472630Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-12-04T11:11:04.9472853Z env:
2025-12-04T11:11:04.9472984Z GIT_DEFAULT_BRANCH: main
2025-12-04T11:11:04.9473132Z ##[endgroup]
2025-12-04T11:11:04.9629888Z ##[group]Run actions/checkout@v4
2025-12-04T11:11:04.9630062Z with:
2025-12-04T11:11:04.9630178Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32
2025-12-04T11:11:04.9630313Z fetch-depth: 0
2025-12-04T11:11:04.9630410Z submodules: recursive
2025-12-04T11:11:04.9630590Z show-progress: false
2025-12-04T11:11:04.9630696Z repository: pytorch/pytorch
2025-12-04T11:11:04.9630869Z token: ***
2025-12-04T11:11:04.9630956Z ssh-strict: true
2025-12-04T11:11:04.9631042Z ssh-user: git
2025-12-04T11:11:04.9631136Z persist-credentials: true
2025-12-04T11:11:04.9631240Z clean: true
2025-12-04T11:11:04.9631336Z sparse-checkout-cone-mode: true
2025-12-04T11:11:04.9631451Z fetch-tags: false
2025-12-04T11:11:04.9631541Z lfs: false
2025-12-04T11:11:04.9631628Z set-safe-directory: true
2025-12-04T11:11:04.9631727Z env:
2025-12-04T11:11:04.9631811Z GIT_DEFAULT_BRANCH: main
2025-12-04T11:11:04.9631911Z ##[endgroup]
2025-12-04T11:11:05.0168356Z Syncing repository: pytorch/pytorch
2025-12-04T11:11:05.0168971Z ##[group]Getting Git version info
2025-12-04T11:11:05.0169142Z Working directory is '/home/runner/_work/pytorch/pytorch'
2025-12-04T11:11:05.0169392Z [command]/usr/bin/git version
2025-12-04T11:11:05.0169500Z git version 2.52.0
2025-12-04T11:11:05.0169880Z ##[endgroup]
2025-12-04T11:11:05.0172984Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/14ae3c61-701b-4715-81e3-50a9739370e1/.gitconfig'
2025-12-04T11:11:05.0177629Z Temporarily overriding HOME='/home/runner/_work/_temp/14ae3c61-701b-4715-81e3-50a9739370e1' before making global git config changes
2025-12-04T11:11:05.0177965Z Adding repository directory to the temporary git global config as a safe directory
2025-12-04T11:11:05.0179968Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch
2025-12-04T11:11:05.0204486Z [command]/usr/bin/git config --local --get remote.origin.url
2025-12-04T11:11:05.0230329Z https://github.com/pytorch/pytorch
2025-12-04T11:11:05.0245536Z ##[group]Removing previously created refs, to avoid conflicts
2025-12-04T11:11:05.0249314Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD
2025-12-04T11:11:05.0275819Z refs/heads/main
2025-12-04T11:11:05.0286813Z [command]/usr/bin/git checkout --detach
2025-12-04T11:11:06.7644138Z HEAD is now at c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452)
2025-12-04T11:11:06.7692079Z [command]/usr/bin/git branch --delete --force main
2025-12-04T11:11:06.7854912Z Deleted branch main (was c0cb6e784044).
2025-12-04T11:11:06.7862148Z ##[endgroup]
2025-12-04T11:11:06.7866795Z [command]/usr/bin/git submodule status
2025-12-04T11:11:06.8116839Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe)
2025-12-04T11:11:06.8158659Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081)
2025-12-04T11:11:06.8202397Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327)
2025-12-04T11:11:06.8259735Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0)
2025-12-04T11:11:06.8325455Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93)
2025-12-04T11:11:06.8396564Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600)
2025-12-04T11:11:06.8777914Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656)
2025-12-04T11:11:06.8811595Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101)
2025-12-04T11:11:06.8833103Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3)
2025-12-04T11:11:06.8895947Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d)
2025-12-04T11:11:06.8995956Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0)
2025-12-04T11:11:06.9093571Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30)
2025-12-04T11:11:06.9130054Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c)
2025-12-04T11:11:06.9214994Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1)
2025-12-04T11:11:06.9247895Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39)
2025-12-04T11:11:06.9313650Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4)
2025-12-04T11:11:06.9338962Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23)
2025-12-04T11:11:06.9678879Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0)
2025-12-04T11:11:06.9736913Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17)
2025-12-04T11:11:06.9821749Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0)
2025-12-04T11:11:06.9958464Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108)
2025-12-04T11:11:07.0009505Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1)
2025-12-04T11:11:07.0045676Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5)
2025-12-04T11:11:07.0161557Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main)
2025-12-04T11:11:07.0175826Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0)
2025-12-04T11:11:07.0190111Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4)
2025-12-04T11:11:07.0212973Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0)
2025-12-04T11:11:07.0422476Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0)
2025-12-04T11:11:07.0442195Z
a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-12-04T11:11:07.0494103Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-12-04T11:11:07.0705687Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-12-04T11:11:07.0766428Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T11:11:07.0807750Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T11:11:07.0821217Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-12-04T11:11:07.0873556Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T11:11:07.0946254Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T11:11:07.1021094Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T11:11:07.1032268Z ##[group]Cleaning the repository 2025-12-04T11:11:07.1035362Z [command]/usr/bin/git clean -ffdx 2025-12-04T11:11:07.1185191Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T11:11:07.1984880Z HEAD is now at c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452) 2025-12-04T11:11:07.2046028Z ##[endgroup] 2025-12-04T11:11:07.2047120Z ##[group]Disabling automatic garbage collection 2025-12-04T11:11:07.2050293Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T11:11:07.2081912Z ##[endgroup] 2025-12-04T11:11:07.2082222Z ##[group]Setting up auth 2025-12-04T11:11:07.2085305Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T11:11:07.2102809Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T11:11:07.2296132Z Entering 'android/libs/fbjni' 2025-12-04T11:11:07.2323993Z Entering 'third_party/FP16' 2025-12-04T11:11:07.2350250Z Entering 'third_party/FXdiv' 2025-12-04T11:11:07.2376106Z Entering 'third_party/NNPACK' 2025-12-04T11:11:07.2401768Z Entering 'third_party/NVTX' 2025-12-04T11:11:07.2429556Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:07.2454672Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:07.2485632Z Entering 'third_party/aiter' 2025-12-04T11:11:07.2512680Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:07.2540697Z Entering 'third_party/benchmark' 2025-12-04T11:11:07.2566583Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:07.2594628Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:07.2620063Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:07.2644552Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:07.2668923Z Entering 'third_party/cutlass' 2025-12-04T11:11:07.2697018Z Entering 'third_party/fbgemm' 2025-12-04T11:11:07.2724124Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:07.2747501Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:07.2776404Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:07.2800760Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:07.2828759Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:07.2852463Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:07.2883948Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:07.2911954Z Entering 'third_party/flash-attention' 2025-12-04T11:11:07.2939190Z Entering 
'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:07.2965805Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:07.2992988Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:07.3018934Z Entering 'third_party/fmt' 2025-12-04T11:11:07.3043574Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:07.3066934Z Entering 'third_party/gloo' 2025-12-04T11:11:07.3091476Z Entering 'third_party/googletest' 2025-12-04T11:11:07.3116957Z Entering 'third_party/ideep' 2025-12-04T11:11:07.3145555Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:07.3171817Z Entering 'third_party/ittapi' 2025-12-04T11:11:07.3196259Z Entering 'third_party/kineto' 2025-12-04T11:11:07.3222080Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:07.3245199Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:07.3268037Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:07.3292008Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:07.3315906Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:07.3341501Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:07.3366193Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:07.3390126Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:07.3413041Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:07.3436626Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:07.3460173Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:07.3487389Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.3511814Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.3540157Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:07.3563632Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:07.3588778Z Entering 'third_party/kleidiai' 2025-12-04T11:11:07.3613844Z Entering 'third_party/mimalloc' 2025-12-04T11:11:07.3637683Z Entering 'third_party/nlohmann' 2025-12-04T11:11:07.3662340Z Entering 'third_party/onnx' 2025-12-04T11:11:07.3693421Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:07.3719500Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:07.3744378Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:07.3767796Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:07.3792096Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:07.3815241Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:07.3839072Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:07.3861830Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:07.3886155Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:07.3911271Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.3936347Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.3961795Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:07.3991940Z Entering 'third_party/pocketfft' 2025-12-04T11:11:07.4022942Z Entering 'third_party/protobuf' 2025-12-04T11:11:07.4055825Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:07.4078669Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:07.4109085Z Entering 'third_party/psimd' 2025-12-04T11:11:07.4134268Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:07.4159061Z Entering 'third_party/pybind11' 2025-12-04T11:11:07.4183167Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:07.4206663Z Entering 'third_party/sleef' 2025-12-04T11:11:07.4230190Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:07.4254797Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:07.4277589Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:07.4300603Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:07.4323907Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:07.4348384Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:07.4388111Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T11:11:07.4407707Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T11:11:07.4556521Z Entering 'android/libs/fbjni' 2025-12-04T11:11:07.4580883Z Entering 'third_party/FP16' 2025-12-04T11:11:07.4604628Z Entering 'third_party/FXdiv' 2025-12-04T11:11:07.4627423Z Entering 'third_party/NNPACK' 2025-12-04T11:11:07.4649936Z Entering 'third_party/NVTX' 2025-12-04T11:11:07.4673294Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:07.4696642Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:07.4726144Z Entering 'third_party/aiter' 2025-12-04T11:11:07.4758558Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:07.4795395Z Entering 'third_party/benchmark' 2025-12-04T11:11:07.4818782Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:07.4846047Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:07.4869012Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:07.4897007Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:07.4920030Z Entering 'third_party/cutlass' 2025-12-04T11:11:07.4948233Z Entering 'third_party/fbgemm' 2025-12-04T11:11:07.4973875Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:07.4996203Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:07.5021782Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:07.5046232Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:07.5073427Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:07.5097246Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:07.5120274Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:07.5145943Z Entering 'third_party/flash-attention' 2025-12-04T11:11:07.5169359Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:07.5193975Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:07.5223181Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:07.5249171Z Entering 
'third_party/fmt' 2025-12-04T11:11:07.5272235Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:07.5295539Z Entering 'third_party/gloo' 2025-12-04T11:11:07.5318899Z Entering 'third_party/googletest' 2025-12-04T11:11:07.5341406Z Entering 'third_party/ideep' 2025-12-04T11:11:07.5374300Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:07.5408103Z Entering 'third_party/ittapi' 2025-12-04T11:11:07.5433628Z Entering 'third_party/kineto' 2025-12-04T11:11:07.5457097Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:07.5480541Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:07.5504556Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:07.5528020Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:07.5554153Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:07.5578767Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:07.5609005Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:07.5632758Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:07.5658069Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:07.5685493Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:07.5709703Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:07.5733338Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.5757500Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.5787140Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:07.5810990Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:07.5836063Z Entering 'third_party/kleidiai' 2025-12-04T11:11:07.5861052Z Entering 'third_party/mimalloc' 2025-12-04T11:11:07.5884571Z Entering 'third_party/nlohmann' 2025-12-04T11:11:07.5910953Z Entering 'third_party/onnx' 2025-12-04T11:11:07.5942110Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:07.5968711Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:07.5992667Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:07.6016375Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:07.6039252Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:07.6067166Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:07.6095446Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:07.6119279Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:07.6143744Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:07.6167666Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.6190406Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.6216357Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:07.6252924Z Entering 'third_party/pocketfft' 2025-12-04T11:11:07.6277495Z Entering 
'third_party/protobuf' 2025-12-04T11:11:07.6302239Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:07.6325255Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:07.6351127Z Entering 'third_party/psimd' 2025-12-04T11:11:07.6376916Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:07.6399712Z Entering 'third_party/pybind11' 2025-12-04T11:11:07.6424977Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:07.6450358Z Entering 'third_party/sleef' 2025-12-04T11:11:07.6476489Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:07.6507890Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:07.6532548Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:07.6558246Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:07.6581660Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:07.6605269Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:07.6645548Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.6667936Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T11:11:07.6840892Z Entering 'android/libs/fbjni' 2025-12-04T11:11:07.6852443Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:07.6866669Z Entering 'third_party/FP16' 2025-12-04T11:11:07.6881720Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:07.6891439Z Entering 'third_party/FXdiv' 2025-12-04T11:11:07.6905243Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:07.6916663Z Entering 'third_party/NNPACK' 2025-12-04T11:11:07.6929427Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:07.6937456Z Entering 'third_party/NVTX' 2025-12-04T11:11:07.6958944Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:07.6967620Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:07.6979370Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:07.6988933Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:07.7001537Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:07.7017196Z Entering 'third_party/aiter' 2025-12-04T11:11:07.7028778Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:07.7039036Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:07.7048902Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7066819Z Entering 'third_party/benchmark' 2025-12-04T11:11:07.7078411Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:07.7087101Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:07.7098050Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7114838Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:07.7125440Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:07.7135549Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:07.7149238Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:07.7159184Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:07.7172323Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:07.7182684Z Entering 'third_party/cutlass' 2025-12-04T11:11:07.7195034Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:07.7208300Z Entering 'third_party/fbgemm' 2025-12-04T11:11:07.7220417Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:07.7231180Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:07.7259142Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:07.7282489Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:07.7309404Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7337672Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:07.7354918Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:07.7383252Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:07.7402111Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:07.7426539Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:07.7445488Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:07.7455393Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:07.7472290Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:07.7486025Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:07.7500449Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:07.7519875Z Entering 'third_party/flash-attention' 2025-12-04T11:11:07.7535546Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:07.7549276Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:07.7563747Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7581462Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:07.7594814Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:07.7616470Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:07.7631003Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:07.7647337Z Entering 'third_party/fmt' 2025-12-04T11:11:07.7661268Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 
2025-12-04T11:11:07.7675962Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:07.7694189Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:07.7707677Z Entering 'third_party/gloo' 2025-12-04T11:11:07.7725662Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:07.7739172Z Entering 'third_party/googletest' 2025-12-04T11:11:07.7756128Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.7768449Z Entering 'third_party/ideep' 2025-12-04T11:11:07.7784905Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:07.7796485Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:07.7809820Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:07.7827694Z Entering 'third_party/ittapi' 2025-12-04T11:11:07.7842419Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:07.7853907Z Entering 'third_party/kineto' 2025-12-04T11:11:07.7867077Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:07.7880685Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:07.7897557Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:07.7910039Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:07.7926949Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:07.7940043Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:07.7953148Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:07.7965453Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:07.7980764Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:07.7992437Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:07.8005763Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:07.8018129Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:07.8032733Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:07.8047487Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:07.8060884Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:07.8073946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:07.8087241Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8098355Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:07.8111623Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:07.8123752Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:07.8136731Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:07.8147981Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:07.8160931Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:07.8173746Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.8192273Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:07.8204405Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.8218101Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:07.8235611Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:07.8250697Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:07.8263314Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:07.8276845Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8295387Z Entering 'third_party/kleidiai' 2025-12-04T11:11:07.8311254Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:07.8326097Z Entering 'third_party/mimalloc' 2025-12-04T11:11:07.8341937Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:07.8355783Z Entering 'third_party/nlohmann' 2025-12-04T11:11:07.8370031Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:07.8384029Z Entering 'third_party/onnx' 2025-12-04T11:11:07.8401463Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:07.8423275Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:07.8438896Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:07.8456843Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:07.8472161Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:07.8488434Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 
2025-12-04T11:11:07.8500997Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:07.8513343Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:07.8527066Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8539202Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:07.8552920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:07.8565572Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:07.8577917Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:07.8591545Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:07.8605929Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:07.8616354Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:07.8631005Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:07.8642963Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:07.8657050Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:07.8669630Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.8684357Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:07.8697071Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.8711919Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:07.8733834Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:07.8747538Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:07.8773443Z Entering 'third_party/pocketfft' 2025-12-04T11:11:07.8790364Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:07.8805312Z Entering 'third_party/protobuf' 2025-12-04T11:11:07.8826515Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:07.8841619Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:07.8858745Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:07.8872071Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:07.8885654Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8904840Z Entering 'third_party/psimd' 
2025-12-04T11:11:07.8921665Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:07.8938136Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:07.8953395Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:07.8968696Z Entering 'third_party/pybind11' 2025-12-04T11:11:07.8986306Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:07.9000792Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:07.9017772Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:07.9033794Z Entering 'third_party/sleef' 2025-12-04T11:11:07.9050772Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:07.9065287Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:07.9082920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:07.9095728Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:07.9111158Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.9123807Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:07.9136657Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:07.9151344Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:07.9163977Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:07.9174025Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:07.9183550Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:07.9191304Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:07.9201221Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:07.9228143Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9251858Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9272067Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9287848Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9304266Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9323553Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9338660Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9352314Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9368028Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9382011Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9396349Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9410174Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9427755Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9441846Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9457138Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9472666Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9487816Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9502740Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9516814Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9530473Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9544145Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9557966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9572032Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9586178Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9599475Z 
[command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9612392Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9628227Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9642051Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9656634Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9670664Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9684779Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9698932Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9713312Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9726851Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9743085Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9757732Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9774055Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9795351Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9809383Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9822339Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9836555Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp 
^includeIf\.gitdir: 2025-12-04T11:11:07.9850432Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9873176Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9888963Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9917131Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9936920Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9963365Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9990289Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0018757Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0034222Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0052486Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0068205Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0084684Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0101020Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0121607Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0137697Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0153811Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0170535Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0186183Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0201664Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0216235Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0229748Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0245212Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0259445Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0274755Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0287940Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0301893Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0316419Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0330604Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0345534Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0359613Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0372471Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0396944Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0420539Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0435937Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0450125Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0464403Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0481707Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0490076Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0510559Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0522739Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0546181Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T11:11:08.0579696Z ##[endgroup] 2025-12-04T11:11:08.0579986Z ##[group]Fetching the repository 2025-12-04T11:11:08.0583543Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T11:11:09.5880624Z From https://github.com/pytorch/pytorch 2025-12-04T11:11:09.5881189Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-12-04T11:11:09.5881722Z * [new branch] 2.9.1 -> origin/2.9.1 2025-12-04T11:11:09.5882326Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-12-04T11:11:09.5882954Z * [new branch] Flamefire-patch-1 -> origin/Flamefire-patch-1 2025-12-04T11:11:09.5883549Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-12-04T11:11:09.5884121Z * [new branch] HOPrintFunc -> origin/HOPrintFunc 2025-12-04T11:11:09.5884628Z * [new branch] IvanKobzarev/stack/1 -> origin/IvanKobzarev/stack/1 2025-12-04T11:11:09.5885144Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-12-04T11:11:09.5885667Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-12-04T11:11:09.5886240Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-12-04T11:11:09.5886816Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-12-04T11:11:09.5887348Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-12-04T11:11:09.5887852Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-12-04T11:11:09.5888480Z * [new branch] Update-Flash-Packaging -> 
origin/Update-Flash-Packaging 2025-12-04T11:11:09.5889004Z * [new branch] VLA_exp -> origin/VLA_exp 2025-12-04T11:11:09.5889473Z * [new branch] activation_bench -> origin/activation_bench 2025-12-04T11:11:09.5889975Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-12-04T11:11:09.5890476Z * [new branch] adi/onednn_aarch64 -> origin/adi/onednn_aarch64 2025-12-04T11:11:09.5890956Z * [new branch] adi/test -> origin/adi/test 2025-12-04T11:11:09.5891292Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-12-04T11:11:09.5891466Z * [new branch] adi/test_m8g -> origin/adi/test_m8g 2025-12-04T11:11:09.5891635Z * [new branch] adi/test_onednn -> origin/adi/test_onednn 2025-12-04T11:11:09.5891815Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-12-04T11:11:09.5892006Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-12-04T11:11:09.5892187Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-12-04T11:11:09.5892911Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-12-04T11:11:09.5893110Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-12-04T11:11:09.5893307Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-12-04T11:11:09.5893625Z * [new branch] albanD-patch-1 -> origin/albanD-patch-1 2025-12-04T11:11:09.5893815Z * [new branch] also-surround-shimh -> origin/also-surround-shimh 2025-12-04T11:11:09.5894006Z * [new branch] angelayi/aot_compile -> origin/angelayi/aot_compile 2025-12-04T11:11:09.5894236Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-12-04T11:11:09.5894453Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-12-04T11:11:09.5894678Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-12-04T11:11:09.5894916Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-12-04T11:11:09.5895111Z * [new branch] angelayi/inductor_const -> origin/angelayi/inductor_const 2025-12-04T11:11:09.5895299Z * [new branch] angelayi/lstm -> origin/angelayi/lstm 2025-12-04T11:11:09.5895481Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-12-04T11:11:09.5895671Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-12-04T11:11:09.5895853Z * [new branch] angelayi/side_eff -> origin/angelayi/side_eff 2025-12-04T11:11:09.5896036Z * [new branch] angelayi/state_dict -> origin/angelayi/state_dict 2025-12-04T11:11:09.5896231Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-12-04T11:11:09.5896416Z * [new branch] angelayi/symm_mem -> origin/angelayi/symm_mem 2025-12-04T11:11:09.5896597Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-12-04T11:11:09.5896780Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-12-04T11:11:09.5896965Z * [new branch] annotate_assert -> origin/annotate_assert 2025-12-04T11:11:09.5897162Z * [new branch] annotate_fallback_kernel -> origin/annotate_fallback_kernel 2025-12-04T11:11:09.5897364Z * [new branch] annotation_deepcopy -> origin/annotation_deepcopy 2025-12-04T11:11:09.5897563Z * [new branch] annotation_dynamo -> origin/annotation_dynamo 2025-12-04T11:11:09.5897755Z * [new branch] aot_eager_stack_trace -> origin/aot_eager_stack_trace 2025-12-04T11:11:09.5897941Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-12-04T11:11:09.5898124Z * [new branch] aoti_const_device -> origin/aoti_const_device 
2025-12-04T11:11:09.5898369Z * [new branch] aoti_fqn_name_interface -> origin/aoti_fqn_name_interface 2025-12-04T11:11:09.5898586Z * [new branch] aoti_package_weights_binary -> origin/aoti_package_weights_binary 2025-12-04T11:11:09.5898799Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-12-04T11:11:09.5899029Z * [new branch] arsh/feat/inductor_check_profiling -> origin/arsh/feat/inductor_check_profiling 2025-12-04T11:11:09.5899249Z * [new branch] async_tp -> origin/async_tp 2025-12-04T11:11:09.5899459Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-12-04T11:11:09.5899723Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-12-04T11:11:09.5899946Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-12-04T11:11:09.5900131Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-12-04T11:11:09.5900362Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-12-04T11:11:09.5900544Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-12-04T11:11:09.5900837Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-12-04T11:11:09.5901019Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-12-04T11:11:09.5902085Z * [new branch] atalman-patch-8 -> origin/atalman-patch-8 2025-12-04T11:11:09.5902377Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-12-04T11:11:09.5902612Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-12-04T11:11:09.5902810Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-12-04T11:11:09.5903086Z * [new branch] attention_benchmarking_clean -> origin/attention_benchmarking_clean 2025-12-04T11:11:09.5903318Z * [new branch] bahuang/dt_fix_scalar_add -> origin/bahuang/dt_fix_scalar_add 2025-12-04T11:11:09.5903530Z * [new branch] bahuang/fix_debug_mode -> origin/bahuang/fix_debug_mode 2025-12-04T11:11:09.5903740Z * [new branch] bahuang/fix_expand -> origin/bahuang/fix_expand 2025-12-04T11:11:09.5903925Z * [new branch] bahuang/test -> origin/bahuang/test 2025-12-04T11:11:09.5904099Z * [new branch] base/1.5 -> origin/base/1.5 2025-12-04T11:11:09.5904311Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-12-04T11:11:09.5904544Z * [new branch] bench_scaled_mm_ops -> origin/bench_scaled_mm_ops 2025-12-04T11:11:09.5904748Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-12-04T11:11:09.5904954Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-12-04T11:11:09.5905154Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-12-04T11:11:09.5905343Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-12-04T11:11:09.5905537Z * [new branch] bf/bug-static-input -> origin/bf/bug-static-input 2025-12-04T11:11:09.5905731Z * [new branch] bf/cg-backend -> origin/bf/cg-backend 2025-12-04T11:11:09.5905915Z * [new branch] bf/cg-nccl-test -> origin/bf/cg-nccl-test 2025-12-04T11:11:09.5906096Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-12-04T11:11:09.5906298Z * [new branch] bf/clean-torchbench-hf -> origin/bf/clean-torchbench-hf 2025-12-04T11:11:09.5906500Z * [new branch] bf/combo-debug-log -> origin/bf/combo-debug-log 2025-12-04T11:11:09.5906690Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-12-04T11:11:09.5906933Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-12-04T11:11:09.5907296Z * 
[new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-12-04T11:11:09.5907618Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-12-04T11:11:09.5907905Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-12-04T11:11:09.5908110Z * [new branch] bf/dynamo-partition -> origin/bf/dynamo-partition 2025-12-04T11:11:09.5908328Z * [new branch] bf/lite -> origin/bf/lite 2025-12-04T11:11:09.5908516Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-12-04T11:11:09.5908750Z * [new branch] bf/partition-cache-free-symbols -> origin/bf/partition-cache-free-symbols 2025-12-04T11:11:09.5909356Z * [new branch] bf/partition-memory-plan -> origin/bf/partition-memory-plan 2025-12-04T11:11:09.5909572Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-12-04T11:11:09.5909928Z * [new branch] bf/partition-view-fallback -> origin/bf/partition-view-fallback 2025-12-04T11:11:09.5910147Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-12-04T11:11:09.5910358Z * [new branch] bf/timm-nov-26-2025 -> origin/bf/timm-nov-26-2025 2025-12-04T11:11:09.5910572Z * [new branch] bf/transformer-pin-4-57-3 -> origin/bf/transformer-pin-4-57-3 2025-12-04T11:11:09.5910798Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-12-04T11:11:09.5911026Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-12-04T11:11:09.5911251Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-12-04T11:11:09.5911464Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-12-04T11:11:09.5911683Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-12-04T11:11:09.5911897Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-12-04T11:11:09.5912121Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-12-04T11:11:09.5912340Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-12-04T11:11:09.5912555Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-12-04T11:11:09.5912767Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-12-04T11:11:09.5912984Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-12-04T11:11:09.5913187Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-12-04T11:11:09.5913409Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-12-04T11:11:09.5913632Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-12-04T11:11:09.5913856Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-12-04T11:11:09.5914067Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-12-04T11:11:09.5914275Z * [new branch] brister/fx_device_type -> origin/brister/fx_device_type 2025-12-04T11:11:09.5914492Z * [new branch] brister/test_inductor_all_fx -> origin/brister/test_inductor_all_fx 2025-12-04T11:11:09.5914747Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-12-04T11:11:09.5914978Z * [new branch] bwd-backup -> origin/bwd-backup 
2025-12-04T11:11:09.5915152Z * [new branch] c57382a49 -> origin/c57382a49 2025-12-04T11:11:09.5915322Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-12-04T11:11:09.5915498Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-12-04T11:11:09.5915701Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-12-04T11:11:09.5915921Z * [new branch] cccclai-patch-1 -> origin/cccclai-patch-1 2025-12-04T11:11:09.5916208Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5916518Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5916794Z * [new branch] cherry-pick-162208-by-pytorch_bot_bot_ -> origin/cherry-pick-162208-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917150Z * [new branch] cherry-pick-163169-by-pytorch_bot_bot_ -> origin/cherry-pick-163169-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917428Z * [new branch] cherry-pick-165086-by-pytorch_bot_bot_ -> origin/cherry-pick-165086-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917700Z * [new branch] cherry-pick-165514-by-pytorch_bot_bot_ -> origin/cherry-pick-165514-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917976Z * [new branch] cherry-pick-165601-by-pytorch_bot_bot_ -> origin/cherry-pick-165601-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5918297Z * [new branch] cherry-pick-165667-by-pytorch_bot_bot_ -> origin/cherry-pick-165667-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5918574Z * [new branch] cherry-pick-165815-by-pytorch_bot_bot_ -> origin/cherry-pick-165815-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5918854Z * [new branch] cherry-pick-165922-by-pytorch_bot_bot_ -> origin/cherry-pick-165922-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919135Z * [new branch] cherry-pick-166148-by-pytorch_bot_bot_ -> origin/cherry-pick-166148-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919415Z * [new branch] cherry-pick-166181-by-pytorch_bot_bot_ -> origin/cherry-pick-166181-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919692Z * [new branch] cherry-pick-166404-by-pytorch_bot_bot_ -> origin/cherry-pick-166404-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919969Z * [new branch] cherry-pick-166427-by-pytorch_bot_bot_ -> origin/cherry-pick-166427-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5920241Z * [new branch] cherry-pick-166480-by-pytorch_bot_bot_ -> origin/cherry-pick-166480-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5920522Z * [new branch] cherry-pick-166570-by-pytorch_bot_bot_ -> origin/cherry-pick-166570-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5920799Z * [new branch] cherry-pick-166993-by-pytorch_bot_bot_ -> origin/cherry-pick-166993-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5921110Z * [new branch] cherry-pick-167111-by-pytorch_bot_bot_ -> origin/cherry-pick-167111-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5921384Z * [new branch] cherry-pick-167478-by-pytorch_bot_bot_ -> origin/cherry-pick-167478-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5921628Z * [new branch] cherry_pick_166036_166040 -> origin/cherry_pick_166036_166040 2025-12-04T11:11:09.5921825Z * [new branch] cherry_pick_166457 -> origin/cherry_pick_166457 2025-12-04T11:11:09.5922009Z * [new branch] cherrypick_166338 -> origin/cherrypick_166338 2025-12-04T11:11:09.5922193Z * [new branch] cherrypick_166458 -> origin/cherrypick_166458 2025-12-04T11:11:09.5922378Z * [new branch] cherrypick_166586 -> origin/cherrypick_166586 2025-12-04T11:11:09.5922557Z * [new branch] cherrypick_166956 -> origin/cherrypick_166956 2025-12-04T11:11:09.5922733Z * [new 
branch] ci_attn -> origin/ci_attn 2025-12-04T11:11:09.5922909Z * [new branch] codex-testing -> origin/codex-testing 2025-12-04T11:11:09.5923288Z * [new branch] codex/add-check_memory_overlap-helper-functions -> origin/codex/add-check_memory_overlap-helper-functions 2025-12-04T11:11:09.5923597Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-12-04T11:11:09.5923923Z * [new branch] codex/investigate-segfaults-in-get_tensor_storage_id -> origin/codex/investigate-segfaults-in-get_tensor_storage_id 2025-12-04T11:11:09.5924294Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-12-04T11:11:09.5924608Z * [new branch] compatiblpy39util -> origin/compatiblpy39util 2025-12-04T11:11:09.5924796Z * [new branch] cond_hop_device -> origin/cond_hop_device 2025-12-04T11:11:09.5925007Z * [new branch] context_test -> origin/context_test 2025-12-04T11:11:09.5925250Z * [new branch] copilot/code-style-cleanup-python-pip -> origin/copilot/code-style-cleanup-python-pip 2025-12-04T11:11:09.5925500Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-12-04T11:11:09.5925722Z * [new branch] cpp-docs-dependency-upgrade -> origin/cpp-docs-dependency-upgrade 2025-12-04T11:11:09.5926004Z * [new branch] crpa/typo-in-inductor_comm_lowering -> origin/crpa/typo-in-inductor_comm_lowering 2025-12-04T11:11:09.5926238Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-12-04T11:11:09.5926446Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-12-04T11:11:09.5926661Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-12-04T11:11:09.5926857Z * [new branch] csl/clean_up -> origin/csl/clean_up 2025-12-04T11:11:09.5927051Z * [new branch] csl/fix_retry_segfault_exit -> origin/csl/fix_retry_segfault_exit 2025-12-04T11:11:09.5927248Z * [new branch] csl/katex -> origin/csl/katex 2025-12-04T11:11:09.5927430Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-12-04T11:11:09.5927610Z * [new branch] csl/lint_testing -> origin/csl/lint_testing 2025-12-04T11:11:09.5927793Z * [new branch] csl/lint_thing -> origin/csl/lint_thing 2025-12-04T11:11:09.5927981Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-12-04T11:11:09.5928230Z * [new branch] csl/manually_gen_json -> origin/csl/manually_gen_json 2025-12-04T11:11:09.5928420Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-12-04T11:11:09.5928614Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-12-04T11:11:09.5928799Z * [new branch] csl/print_timing -> origin/csl/print_timing 2025-12-04T11:11:09.5928987Z * [new branch] csl/remove_experiment -> origin/csl/remove_experiment 2025-12-04T11:11:09.5929191Z * [new branch] csl/remove_maybe_unused_var -> origin/csl/remove_maybe_unused_var 2025-12-04T11:11:09.5929432Z * [new branch] csl/remove_repo_specific_autolabel -> origin/csl/remove_repo_specific_autolabel 2025-12-04T11:11:09.5929661Z * [new branch] csl/remove_run_parallel -> origin/csl/remove_run_parallel 2025-12-04T11:11:09.5929855Z * [new branch] csl/remove_unused_vars -> origin/csl/remove_unused_vars 2025-12-04T11:11:09.5930050Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-12-04T11:11:09.5930228Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-12-04T11:11:09.5930427Z * [new branch] csl/smaller_avx_amx_runenrs -> origin/csl/smaller_avx_amx_runenrs 
2025-12-04T11:11:09.5930628Z * [new branch] csl/td_job_level -> origin/csl/td_job_level 2025-12-04T11:11:09.5930840Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-12-04T11:11:09.5931092Z * [new branch] csl/test_owners_autograd_dispatch_nn -> origin/csl/test_owners_autograd_dispatch_nn 2025-12-04T11:11:09.5931350Z * [new branch] csl/test_owners_higher_confidence -> origin/csl/test_owners_higher_confidence 2025-12-04T11:11:09.5931579Z * [new branch] csl/upload_json_running -> origin/csl/upload_json_running 2025-12-04T11:11:09.5931801Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-12-04T11:11:09.5931978Z * [new branch] csl/xml_stuff -> origin/csl/xml_stuff 2025-12-04T11:11:09.5932189Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-12-04T11:11:09.5932361Z * [new branch] cuda_mempool -> origin/cuda_mempool 2025-12-04T11:11:09.5932548Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-12-04T11:11:09.5932753Z * [new branch] d4l3k/debug_plane_frtrace -> origin/d4l3k/debug_plane_frtrace 2025-12-04T11:11:09.5932943Z * [new branch] daxia6/2.8o3 -> origin/daxia6/2.8o3 2025-12-04T11:11:09.5933120Z * [new branch] debug-guard -> origin/debug-guard 2025-12-04T11:11:09.5933307Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-12-04T11:11:09.5933640Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 2025-12-04T11:11:09.5934099Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 2025-12-04T11:11:09.5934442Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-12-04T11:11:09.5934685Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-12-04T11:11:09.5934924Z * [new branch] dev/dhruva/flex_attn_opt -> origin/dev/dhruva/flex_attn_opt 2025-12-04T11:11:09.5935134Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-12-04T11:11:09.5935330Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-12-04T11:11:09.5935535Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-12-04T11:11:09.5935724Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-12-04T11:11:09.5935934Z * [new branch] dev/joona/fix_sdpa_memtest -> origin/dev/joona/fix_sdpa_memtest 2025-12-04T11:11:09.5936155Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-12-04T11:11:09.5936383Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-12-04T11:11:09.5936592Z * [new branch] dev/joona/scalar_clamp -> origin/dev/joona/scalar_clamp 2025-12-04T11:11:09.5936779Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-12-04T11:11:09.5936966Z * [new branch] dev/joona/sdpa_api -> origin/dev/joona/sdpa_api 2025-12-04T11:11:09.5937150Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-12-04T11:11:09.5937356Z * [new branch] dev/joona/ulpAssertClose -> origin/dev/joona/ulpAssertClose 2025-12-04T11:11:09.5937556Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-12-04T11:11:09.5937733Z * [new branch] disp_counter -> origin/disp_counter 2025-12-04T11:11:09.5937917Z * [new branch] divyanshk-patch-1 -> origin/divyanshk-patch-1 2025-12-04T11:11:09.5938097Z * [new branch] docs -> 
origin/docs 2025-12-04T11:11:09.5938298Z * [new branch] documentation -> origin/documentation 2025-12-04T11:11:09.5938486Z * [new branch] eager_model_benchmarks -> origin/eager_model_benchmarks 2025-12-04T11:11:09.5938704Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-12-04T11:11:09.5938950Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-12-04T11:11:09.5939253Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-12-04T11:11:09.5939457Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-12-04T11:11:09.5939658Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-12-04T11:11:09.5939833Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-12-04T11:11:09.5940002Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-12-04T11:11:09.5940167Z * [new branch] eqy-patch-5 -> origin/eqy-patch-5 2025-12-04T11:11:09.5940335Z * [new branch] eqy-patch-6 -> origin/eqy-patch-6 2025-12-04T11:11:09.5940516Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-12-04T11:11:09.5940758Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-12-04T11:11:09.5941023Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-12-04T11:11:09.5941274Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-12-04T11:11:09.5941568Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-12-04T11:11:09.5941868Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-12-04T11:11:09.5942175Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-12-04T11:11:09.5942444Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-12-04T11:11:09.5942682Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-12-04T11:11:09.5942937Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-12-04T11:11:09.5943165Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-12-04T11:11:09.5943443Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-12-04T11:11:09.5943712Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-12-04T11:11:09.5943941Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-12-04T11:11:09.5944214Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-12-04T11:11:09.5944490Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-12-04T11:11:09.5944756Z * [new branch] exclamaforte/profiler-visualization -> origin/exclamaforte/profiler-visualization 2025-12-04T11:11:09.5945030Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-12-04T11:11:09.5945300Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-12-04T11:11:09.5945596Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-12-04T11:11:09.5945828Z * [new branch] exec -> origin/exec 2025-12-04T11:11:09.5946005Z * [new branch] experimental-mosaic -> 
origin/experimental-mosaic 2025-12-04T11:11:09.5946200Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-12-04T11:11:09.5946425Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-12-04T11:11:09.5946604Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-12-04T11:11:09.5946817Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-12-04T11:11:09.5946997Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-12-04T11:11:09.5947169Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-12-04T11:11:09.5947373Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-12-04T11:11:09.5947549Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-12-04T11:11:09.5947723Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-12-04T11:11:09.5947902Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-12-04T11:11:09.5948081Z * [new branch] export-D82250826 -> origin/export-D82250826 2025-12-04T11:11:09.5948302Z * [new branch] export-D82253817 -> origin/export-D82253817 2025-12-04T11:11:09.5948482Z * [new branch] export-D83541846 -> origin/export-D83541846 2025-12-04T11:11:09.5948661Z * [new branch] export-D83627170 -> origin/export-D83627170 2025-12-04T11:11:09.5948834Z * [new branch] export-D83766701 -> origin/export-D83766701 2025-12-04T11:11:09.5949013Z * [new branch] export-D83768878 -> origin/export-D83768878 2025-12-04T11:11:09.5949189Z * [new branch] export-D83769447 -> origin/export-D83769447 2025-12-04T11:11:09.5949361Z * [new branch] export-D84089824 -> origin/export-D84089824 2025-12-04T11:11:09.5949537Z * [new branch] export-D84213020 -> origin/export-D84213020 2025-12-04T11:11:09.5949708Z * [new branch] export-D84373821 -> origin/export-D84373821 2025-12-04T11:11:09.5949883Z * [new branch] export-D84612194 -> origin/export-D84612194 2025-12-04T11:11:09.5950062Z * [new branch] export-D84890985 -> origin/export-D84890985 2025-12-04T11:11:09.5950234Z * [new branch] export-D85122326 -> origin/export-D85122326 2025-12-04T11:11:09.5950414Z * [new branch] export-D86256198 -> origin/export-D86256198 2025-12-04T11:11:09.5950595Z * [new branch] export-D86460608 -> origin/export-D86460608 2025-12-04T11:11:09.5950767Z * [new branch] export-D86474796 -> origin/export-D86474796 2025-12-04T11:11:09.5950944Z * [new branch] export-D86712396 -> origin/export-D86712396 2025-12-04T11:11:09.5951121Z * [new branch] export-D87022129 -> origin/export-D87022129 2025-12-04T11:11:09.5951296Z * [new branch] export-D87838959 -> origin/export-D87838959 2025-12-04T11:11:09.5951474Z * [new branch] export-D88319437 -> origin/export-D88319437 2025-12-04T11:11:09.5951705Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-12-04T11:11:09.5951939Z * [new branch] ezyang-titan-october -> origin/ezyang-titan-october 2025-12-04T11:11:09.5952143Z * [new branch] ezyang-titan-october2 -> origin/ezyang-titan-october2 2025-12-04T11:11:09.5952336Z * [new branch] ezyang-war -> origin/ezyang-war 2025-12-04T11:11:09.5952535Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-12-04T11:11:09.5952737Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-12-04T11:11:09.5952934Z * [new branch] fadeputr/sequence_fbgemm -> origin/fadeputr/sequence_fbgemm 2025-12-04T11:11:09.5953128Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-12-04T11:11:09.5953310Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-12-04T11:11:09.5953501Z * [new 
branch] fca -> origin/fca 2025-12-04T11:11:09.5953707Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-12-04T11:11:09.5953873Z * [new branch] fca5 -> origin/fca5 2025-12-04T11:11:09.5954087Z * [new branch] feature/justknobs-cpp -> origin/feature/justknobs-cpp 2025-12-04T11:11:09.5954294Z * [new branch] feature/numa-forkserver -> origin/feature/numa-forkserver 2025-12-04T11:11:09.5954488Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-12-04T11:11:09.5954677Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-12-04T11:11:09.5954867Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-12-04T11:11:09.5955099Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-12-04T11:11:09.5955296Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-12-04T11:11:09.5955496Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-12-04T11:11:09.5955685Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-12-04T11:11:09.5955887Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-12-04T11:11:09.5956089Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-12-04T11:11:09.5956288Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-12-04T11:11:09.5956507Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-12-04T11:11:09.5956720Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-12-04T11:11:09.5956908Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-12-04T11:11:09.5957088Z * [new branch] fix_addmm_issue -> origin/fix_addmm_issue 2025-12-04T11:11:09.5957291Z * [new branch] fix_amd_missing_cluster_dims -> origin/fix_amd_missing_cluster_dims 2025-12-04T11:11:09.5957499Z * [new branch] fix_bench_bwd_pass -> origin/fix_bench_bwd_pass 2025-12-04T11:11:09.5957698Z * [new branch] fix_mem_profiler_config -> origin/fix_mem_profiler_config 2025-12-04T11:11:09.5957885Z * [new branch] fix_nvrtc_discovery -> origin/fix_nvrtc_discovery 2025-12-04T11:11:09.5958066Z * [new branch] fix_op_runner -> origin/fix_op_runner 2025-12-04T11:11:09.5958280Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-12-04T11:11:09.5958451Z * [new branch] fixes-triage -> origin/fixes-triage 2025-12-04T11:11:09.5958631Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-12-04T11:11:09.5958816Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-12-04T11:11:09.5958996Z * [new branch] flex-flash -> origin/flex-flash 2025-12-04T11:11:09.5959204Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-12-04T11:11:09.5959413Z * [new branch] flex_flash -> origin/flex_flash 2025-12-04T11:11:09.5959617Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-12-04T11:11:09.5959869Z * [new branch] fmassa/tests_comm_compute_scheduler -> origin/fmassa/tests_comm_compute_scheduler 2025-12-04T11:11:09.5960093Z * [new branch] forkserver_fix -> origin/forkserver_fix 2025-12-04T11:11:09.5960273Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-12-04T11:11:09.5982399Z * [new branch] fx_cpp -> origin/fx_cpp 2025-12-04T11:11:09.5982639Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-12-04T11:11:09.5983818Z * [new branch] galv-patch-1 -> origin/galv-patch-1 2025-12-04T11:11:09.5984064Z * [new branch] galv/cudagraphs-conditional-nodes-4 -> origin/galv/cudagraphs-conditional-nodes-4 
2025-12-04T11:11:09.5984364Z * [new branch] georgehong/cmakelists-patch -> origin/georgehong/cmakelists-patch 2025-12-04T11:11:09.5984589Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-12-04T11:11:09.5984782Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-12-04T11:11:09.5984974Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-12-04T11:11:09.5985177Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-12-04T11:11:09.5985380Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-12-04T11:11:09.5985577Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-12-04T11:11:09.5985774Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-12-04T11:11:09.5985967Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-12-04T11:11:09.5986157Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-12-04T11:11:09.5986348Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-12-04T11:11:09.5986539Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-12-04T11:11:09.5986722Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-12-04T11:11:09.5986912Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-12-04T11:11:09.5987095Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-12-04T11:11:09.5987291Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-12-04T11:11:09.5987480Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-12-04T11:11:09.5987664Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-12-04T11:11:09.5987857Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-12-04T11:11:09.5988046Z * [new branch] gh/H-Huang/226/base -> origin/gh/H-Huang/226/base 2025-12-04T11:11:09.5988273Z * [new branch] gh/H-Huang/226/head -> origin/gh/H-Huang/226/head 2025-12-04T11:11:09.5988461Z * [new branch] gh/H-Huang/226/orig -> origin/gh/H-Huang/226/orig 2025-12-04T11:11:09.5988649Z * [new branch] gh/H-Huang/228/base -> origin/gh/H-Huang/228/base 2025-12-04T11:11:09.5988834Z * [new branch] gh/H-Huang/228/head -> origin/gh/H-Huang/228/head 2025-12-04T11:11:09.5989029Z * [new branch] gh/H-Huang/228/orig -> origin/gh/H-Huang/228/orig 2025-12-04T11:11:09.5989235Z * [new branch] gh/IvanKobzarev/150/base -> origin/gh/IvanKobzarev/150/base 2025-12-04T11:11:09.5989446Z * [new branch] gh/IvanKobzarev/150/head -> origin/gh/IvanKobzarev/150/head 2025-12-04T11:11:09.5989662Z * [new branch] gh/IvanKobzarev/150/orig -> origin/gh/IvanKobzarev/150/orig 2025-12-04T11:11:09.5989874Z * [new branch] gh/IvanKobzarev/157/base -> origin/gh/IvanKobzarev/157/base 2025-12-04T11:11:09.5990079Z * [new branch] gh/IvanKobzarev/157/head -> origin/gh/IvanKobzarev/157/head 2025-12-04T11:11:09.5990288Z * [new branch] gh/IvanKobzarev/157/orig -> origin/gh/IvanKobzarev/157/orig 2025-12-04T11:11:09.5990498Z * [new branch] gh/IvanKobzarev/159/base -> origin/gh/IvanKobzarev/159/base 2025-12-04T11:11:09.5990703Z * [new branch] gh/IvanKobzarev/159/head -> origin/gh/IvanKobzarev/159/head 2025-12-04T11:11:09.5990953Z * [new branch] gh/IvanKobzarev/159/orig -> origin/gh/IvanKobzarev/159/orig 2025-12-04T11:11:09.5991165Z * [new branch] gh/IvanKobzarev/162/base -> origin/gh/IvanKobzarev/162/base 2025-12-04T11:11:09.5991406Z * [new branch] gh/IvanKobzarev/162/head -> origin/gh/IvanKobzarev/162/head 2025-12-04T11:11:09.5991617Z * [new branch] 
gh/IvanKobzarev/162/orig -> origin/gh/IvanKobzarev/162/orig 2025-12-04T11:11:09.5991827Z * [new branch] gh/IvanKobzarev/163/base -> origin/gh/IvanKobzarev/163/base 2025-12-04T11:11:09.5992029Z * [new branch] gh/IvanKobzarev/163/head -> origin/gh/IvanKobzarev/163/head 2025-12-04T11:11:09.5992240Z * [new branch] gh/IvanKobzarev/163/orig -> origin/gh/IvanKobzarev/163/orig 2025-12-04T11:11:09.5992443Z * [new branch] gh/IvanKobzarev/166/base -> origin/gh/IvanKobzarev/166/base 2025-12-04T11:11:09.5992660Z * [new branch] gh/IvanKobzarev/166/head -> origin/gh/IvanKobzarev/166/head 2025-12-04T11:11:09.5992870Z * [new branch] gh/IvanKobzarev/166/orig -> origin/gh/IvanKobzarev/166/orig 2025-12-04T11:11:09.5993078Z * [new branch] gh/IvanKobzarev/167/base -> origin/gh/IvanKobzarev/167/base 2025-12-04T11:11:09.5993291Z * [new branch] gh/IvanKobzarev/167/head -> origin/gh/IvanKobzarev/167/head 2025-12-04T11:11:09.5993502Z * [new branch] gh/IvanKobzarev/167/orig -> origin/gh/IvanKobzarev/167/orig 2025-12-04T11:11:09.5993706Z * [new branch] gh/IvanKobzarev/168/base -> origin/gh/IvanKobzarev/168/base 2025-12-04T11:11:09.5993916Z * [new branch] gh/IvanKobzarev/168/head -> origin/gh/IvanKobzarev/168/head 2025-12-04T11:11:09.5994125Z * [new branch] gh/IvanKobzarev/168/orig -> origin/gh/IvanKobzarev/168/orig 2025-12-04T11:11:09.5994329Z * [new branch] gh/IvanKobzarev/169/base -> origin/gh/IvanKobzarev/169/base 2025-12-04T11:11:09.5994541Z * [new branch] gh/IvanKobzarev/169/head -> origin/gh/IvanKobzarev/169/head 2025-12-04T11:11:09.5994750Z * [new branch] gh/IvanKobzarev/169/orig -> origin/gh/IvanKobzarev/169/orig 2025-12-04T11:11:09.5994958Z * [new branch] gh/IvanKobzarev/170/base -> origin/gh/IvanKobzarev/170/base 2025-12-04T11:11:09.5995167Z * [new branch] gh/IvanKobzarev/170/head -> origin/gh/IvanKobzarev/170/head 2025-12-04T11:11:09.5995377Z * [new branch] gh/IvanKobzarev/170/orig -> origin/gh/IvanKobzarev/170/orig 2025-12-04T11:11:09.5995580Z * [new branch] gh/IvanKobzarev/171/base -> origin/gh/IvanKobzarev/171/base 2025-12-04T11:11:09.5995789Z * [new branch] gh/IvanKobzarev/171/head -> origin/gh/IvanKobzarev/171/head 2025-12-04T11:11:09.5995998Z * [new branch] gh/IvanKobzarev/171/orig -> origin/gh/IvanKobzarev/171/orig 2025-12-04T11:11:09.5996203Z * [new branch] gh/IvanKobzarev/172/base -> origin/gh/IvanKobzarev/172/base 2025-12-04T11:11:09.5996414Z * [new branch] gh/IvanKobzarev/172/head -> origin/gh/IvanKobzarev/172/head 2025-12-04T11:11:09.5996624Z * [new branch] gh/IvanKobzarev/172/orig -> origin/gh/IvanKobzarev/172/orig 2025-12-04T11:11:09.5996829Z * [new branch] gh/IvanKobzarev/173/base -> origin/gh/IvanKobzarev/173/base 2025-12-04T11:11:09.5997036Z * [new branch] gh/IvanKobzarev/173/head -> origin/gh/IvanKobzarev/173/head 2025-12-04T11:11:09.5997245Z * [new branch] gh/IvanKobzarev/173/orig -> origin/gh/IvanKobzarev/173/orig 2025-12-04T11:11:09.5997447Z * [new branch] gh/IvanKobzarev/174/base -> origin/gh/IvanKobzarev/174/base 2025-12-04T11:11:09.5997656Z * [new branch] gh/IvanKobzarev/174/head -> origin/gh/IvanKobzarev/174/head 2025-12-04T11:11:09.5997864Z * [new branch] gh/IvanKobzarev/174/orig -> origin/gh/IvanKobzarev/174/orig 2025-12-04T11:11:09.5998104Z * [new branch] gh/IvanKobzarev/175/base -> origin/gh/IvanKobzarev/175/base 2025-12-04T11:11:09.5998356Z * [new branch] gh/IvanKobzarev/175/head -> origin/gh/IvanKobzarev/175/head 2025-12-04T11:11:09.5998603Z * [new branch] gh/IvanKobzarev/175/orig -> origin/gh/IvanKobzarev/175/orig 2025-12-04T11:11:09.5998806Z * [new branch] 
gh/IvanKobzarev/176/base -> origin/gh/IvanKobzarev/176/base 2025-12-04T11:11:09.5999014Z * [new branch] gh/IvanKobzarev/176/head -> origin/gh/IvanKobzarev/176/head 2025-12-04T11:11:09.5999220Z * [new branch] gh/IvanKobzarev/176/orig -> origin/gh/IvanKobzarev/176/orig 2025-12-04T11:11:09.5999430Z * [new branch] gh/IvanKobzarev/177/base -> origin/gh/IvanKobzarev/177/base 2025-12-04T11:11:09.5999639Z * [new branch] gh/IvanKobzarev/177/head -> origin/gh/IvanKobzarev/177/head 2025-12-04T11:11:09.5999847Z * [new branch] gh/IvanKobzarev/177/orig -> origin/gh/IvanKobzarev/177/orig 2025-12-04T11:11:09.6000056Z * [new branch] gh/IvanKobzarev/178/base -> origin/gh/IvanKobzarev/178/base 2025-12-04T11:11:09.6000264Z * [new branch] gh/IvanKobzarev/178/head -> origin/gh/IvanKobzarev/178/head 2025-12-04T11:11:09.6000469Z * [new branch] gh/IvanKobzarev/178/orig -> origin/gh/IvanKobzarev/178/orig 2025-12-04T11:11:09.6000673Z * [new branch] gh/IvanKobzarev/179/base -> origin/gh/IvanKobzarev/179/base 2025-12-04T11:11:09.6000879Z * [new branch] gh/IvanKobzarev/179/head -> origin/gh/IvanKobzarev/179/head 2025-12-04T11:11:09.6001079Z * [new branch] gh/IvanKobzarev/179/orig -> origin/gh/IvanKobzarev/179/orig 2025-12-04T11:11:09.6001282Z * [new branch] gh/IvanKobzarev/180/base -> origin/gh/IvanKobzarev/180/base 2025-12-04T11:11:09.6001485Z * [new branch] gh/IvanKobzarev/180/head -> origin/gh/IvanKobzarev/180/head 2025-12-04T11:11:09.6001689Z * [new branch] gh/IvanKobzarev/180/orig -> origin/gh/IvanKobzarev/180/orig 2025-12-04T11:11:09.6001892Z * [new branch] gh/IvanKobzarev/181/base -> origin/gh/IvanKobzarev/181/base 2025-12-04T11:11:09.6002102Z * [new branch] gh/IvanKobzarev/181/head -> origin/gh/IvanKobzarev/181/head 2025-12-04T11:11:09.6002301Z * [new branch] gh/IvanKobzarev/181/orig -> origin/gh/IvanKobzarev/181/orig 2025-12-04T11:11:09.6002506Z * [new branch] gh/IvanKobzarev/182/base -> origin/gh/IvanKobzarev/182/base 2025-12-04T11:11:09.6002710Z * [new branch] gh/IvanKobzarev/182/head -> origin/gh/IvanKobzarev/182/head 2025-12-04T11:11:09.6002911Z * [new branch] gh/IvanKobzarev/182/orig -> origin/gh/IvanKobzarev/182/orig 2025-12-04T11:11:09.6003115Z * [new branch] gh/IvanKobzarev/183/base -> origin/gh/IvanKobzarev/183/base 2025-12-04T11:11:09.6003321Z * [new branch] gh/IvanKobzarev/183/head -> origin/gh/IvanKobzarev/183/head 2025-12-04T11:11:09.6003524Z * [new branch] gh/IvanKobzarev/183/orig -> origin/gh/IvanKobzarev/183/orig 2025-12-04T11:11:09.6003728Z * [new branch] gh/IvanKobzarev/184/base -> origin/gh/IvanKobzarev/184/base 2025-12-04T11:11:09.6003935Z * [new branch] gh/IvanKobzarev/184/head -> origin/gh/IvanKobzarev/184/head 2025-12-04T11:11:09.6004135Z * [new branch] gh/IvanKobzarev/184/orig -> origin/gh/IvanKobzarev/184/orig 2025-12-04T11:11:09.6004344Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-12-04T11:11:09.6004549Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-12-04T11:11:09.6004747Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-12-04T11:11:09.6004946Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-12-04T11:11:09.6005176Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-12-04T11:11:09.6005377Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-12-04T11:11:09.6005576Z * [new branch] gh/NikhilAPatel/5/base -> origin/gh/NikhilAPatel/5/base 2025-12-04T11:11:09.6005796Z * [new branch] gh/NikhilAPatel/5/head -> 
origin/gh/NikhilAPatel/5/head 2025-12-04T11:11:09.6005995Z * [new branch] gh/NikhilAPatel/5/orig -> origin/gh/NikhilAPatel/5/orig 2025-12-04T11:11:09.6006190Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-12-04T11:11:09.6006372Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-12-04T11:11:09.6006553Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-12-04T11:11:09.6006734Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-12-04T11:11:09.6006914Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-12-04T11:11:09.6007095Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-12-04T11:11:09.6007273Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-12-04T11:11:09.6007453Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-12-04T11:11:09.6007633Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-12-04T11:11:09.6007811Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-12-04T11:11:09.6007985Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-12-04T11:11:09.6008192Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-12-04T11:11:09.6008371Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-12-04T11:11:09.6008549Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-12-04T11:11:09.6008727Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-12-04T11:11:09.6008907Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-12-04T11:11:09.6009085Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-12-04T11:11:09.6009264Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-12-04T11:11:09.6009439Z * [new branch] gh/PaliC/25/head -> origin/gh/PaliC/25/head 2025-12-04T11:11:09.6009617Z * [new branch] gh/PaliC/25/next -> origin/gh/PaliC/25/next 2025-12-04T11:11:09.6009794Z * [new branch] gh/PaliC/25/orig -> origin/gh/PaliC/25/orig 2025-12-04T11:11:09.6009967Z * [new branch] gh/PaliC/26/head -> origin/gh/PaliC/26/head 2025-12-04T11:11:09.6010146Z * [new branch] gh/PaliC/26/next -> origin/gh/PaliC/26/next 2025-12-04T11:11:09.6010325Z * [new branch] gh/PaliC/26/orig -> origin/gh/PaliC/26/orig 2025-12-04T11:11:09.6010498Z * [new branch] gh/PaliC/27/next -> origin/gh/PaliC/27/next 2025-12-04T11:11:09.6010676Z * [new branch] gh/PaliC/28/head -> origin/gh/PaliC/28/head 2025-12-04T11:11:09.6010854Z * [new branch] gh/PaliC/28/next -> origin/gh/PaliC/28/next 2025-12-04T11:11:09.6011026Z * [new branch] gh/PaliC/28/orig -> origin/gh/PaliC/28/orig 2025-12-04T11:11:09.6011203Z * [new branch] gh/PaliC/29/head -> origin/gh/PaliC/29/head 2025-12-04T11:11:09.6011383Z * [new branch] gh/PaliC/29/next -> origin/gh/PaliC/29/next 2025-12-04T11:11:09.6011557Z * [new branch] gh/PaliC/29/orig -> origin/gh/PaliC/29/orig 2025-12-04T11:11:09.6011735Z * [new branch] gh/PaliC/30/head -> origin/gh/PaliC/30/head 2025-12-04T11:11:09.6011953Z * [new branch] gh/PaliC/30/next -> origin/gh/PaliC/30/next 2025-12-04T11:11:09.6012128Z * [new branch] gh/PaliC/30/orig -> origin/gh/PaliC/30/orig 2025-12-04T11:11:09.6012339Z * [new branch] gh/PaliC/31/head -> origin/gh/PaliC/31/head 2025-12-04T11:11:09.6012513Z * [new branch] gh/PaliC/31/next -> origin/gh/PaliC/31/next 2025-12-04T11:11:09.6012694Z * [new branch] gh/PaliC/31/orig -> origin/gh/PaliC/31/orig 2025-12-04T11:11:09.6012885Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-12-04T11:11:09.6013080Z * [new branch] gh/PaulZhang12/25/head 
-> origin/gh/PaulZhang12/25/head 2025-12-04T11:11:09.6013279Z * [new branch] gh/PaulZhang12/25/orig -> origin/gh/PaulZhang12/25/orig 2025-12-04T11:11:09.6013478Z * [new branch] gh/PaulZhang12/28/base -> origin/gh/PaulZhang12/28/base 2025-12-04T11:11:09.6013674Z * [new branch] gh/PaulZhang12/28/head -> origin/gh/PaulZhang12/28/head 2025-12-04T11:11:09.6013873Z * [new branch] gh/PaulZhang12/28/orig -> origin/gh/PaulZhang12/28/orig 2025-12-04T11:11:09.6014075Z * [new branch] gh/PaulZhang12/31/base -> origin/gh/PaulZhang12/31/base 2025-12-04T11:11:09.6014266Z * [new branch] gh/PaulZhang12/31/head -> origin/gh/PaulZhang12/31/head 2025-12-04T11:11:09.6014463Z * [new branch] gh/PaulZhang12/31/orig -> origin/gh/PaulZhang12/31/orig 2025-12-04T11:11:09.6014660Z * [new branch] gh/PaulZhang12/37/base -> origin/gh/PaulZhang12/37/base 2025-12-04T11:11:09.6014852Z * [new branch] gh/PaulZhang12/37/head -> origin/gh/PaulZhang12/37/head 2025-12-04T11:11:09.6015047Z * [new branch] gh/PaulZhang12/37/orig -> origin/gh/PaulZhang12/37/orig 2025-12-04T11:11:09.6015243Z * [new branch] gh/PaulZhang12/40/base -> origin/gh/PaulZhang12/40/base 2025-12-04T11:11:09.6015440Z * [new branch] gh/PaulZhang12/40/head -> origin/gh/PaulZhang12/40/head 2025-12-04T11:11:09.6015636Z * [new branch] gh/PaulZhang12/40/orig -> origin/gh/PaulZhang12/40/orig 2025-12-04T11:11:09.6015835Z * [new branch] gh/PaulZhang12/42/base -> origin/gh/PaulZhang12/42/base 2025-12-04T11:11:09.6016028Z * [new branch] gh/PaulZhang12/42/head -> origin/gh/PaulZhang12/42/head 2025-12-04T11:11:09.6016227Z * [new branch] gh/PaulZhang12/43/base -> origin/gh/PaulZhang12/43/base 2025-12-04T11:11:09.6016425Z * [new branch] gh/PaulZhang12/43/head -> origin/gh/PaulZhang12/43/head 2025-12-04T11:11:09.6016617Z * [new branch] gh/PaulZhang12/43/orig -> origin/gh/PaulZhang12/43/orig 2025-12-04T11:11:09.6016813Z * [new branch] gh/PaulZhang12/44/base -> origin/gh/PaulZhang12/44/base 2025-12-04T11:11:09.6017012Z * [new branch] gh/PaulZhang12/44/head -> origin/gh/PaulZhang12/44/head 2025-12-04T11:11:09.6017204Z * [new branch] gh/PaulZhang12/45/base -> origin/gh/PaulZhang12/45/base 2025-12-04T11:11:09.6017400Z * [new branch] gh/PaulZhang12/45/head -> origin/gh/PaulZhang12/45/head 2025-12-04T11:11:09.6017597Z * [new branch] gh/PaulZhang12/45/orig -> origin/gh/PaulZhang12/45/orig 2025-12-04T11:11:09.6017788Z * [new branch] gh/PaulZhang12/46/base -> origin/gh/PaulZhang12/46/base 2025-12-04T11:11:09.6017981Z * [new branch] gh/PaulZhang12/46/head -> origin/gh/PaulZhang12/46/head 2025-12-04T11:11:09.6018313Z * [new branch] gh/PaulZhang12/46/orig -> origin/gh/PaulZhang12/46/orig 2025-12-04T11:11:09.6018503Z * [new branch] gh/PaulZhang12/47/base -> origin/gh/PaulZhang12/47/base 2025-12-04T11:11:09.6018696Z * [new branch] gh/PaulZhang12/47/head -> origin/gh/PaulZhang12/47/head 2025-12-04T11:11:09.6018926Z * [new branch] gh/PaulZhang12/47/orig -> origin/gh/PaulZhang12/47/orig 2025-12-04T11:11:09.6019117Z * [new branch] gh/PaulZhang12/48/base -> origin/gh/PaulZhang12/48/base 2025-12-04T11:11:09.6019313Z * [new branch] gh/PaulZhang12/48/head -> origin/gh/PaulZhang12/48/head 2025-12-04T11:11:09.6019544Z * [new branch] gh/PaulZhang12/48/orig -> origin/gh/PaulZhang12/48/orig 2025-12-04T11:11:09.6019732Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-12-04T11:11:09.6019926Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-12-04T11:11:09.6020128Z * [new branch] gh/SherlockNoMad/1/base -> origin/gh/SherlockNoMad/1/base 
2025-12-04T11:11:09.6020327Z * [new branch] gh/SherlockNoMad/1/head -> origin/gh/SherlockNoMad/1/head 2025-12-04T11:11:09.6020531Z * [new branch] gh/SherlockNoMad/10/base -> origin/gh/SherlockNoMad/10/base 2025-12-04T11:11:09.6020742Z * [new branch] gh/SherlockNoMad/10/head -> origin/gh/SherlockNoMad/10/head 2025-12-04T11:11:09.6020946Z * [new branch] gh/SherlockNoMad/10/orig -> origin/gh/SherlockNoMad/10/orig 2025-12-04T11:11:09.6021154Z * [new branch] gh/SherlockNoMad/11/base -> origin/gh/SherlockNoMad/11/base 2025-12-04T11:11:09.6021356Z * [new branch] gh/SherlockNoMad/11/head -> origin/gh/SherlockNoMad/11/head 2025-12-04T11:11:09.6021556Z * [new branch] gh/SherlockNoMad/11/orig -> origin/gh/SherlockNoMad/11/orig 2025-12-04T11:11:09.6021758Z * [new branch] gh/SherlockNoMad/12/base -> origin/gh/SherlockNoMad/12/base 2025-12-04T11:11:09.6021961Z * [new branch] gh/SherlockNoMad/12/head -> origin/gh/SherlockNoMad/12/head 2025-12-04T11:11:09.6022159Z * [new branch] gh/SherlockNoMad/12/orig -> origin/gh/SherlockNoMad/12/orig 2025-12-04T11:11:09.6022364Z * [new branch] gh/SherlockNoMad/15/base -> origin/gh/SherlockNoMad/15/base 2025-12-04T11:11:09.6022565Z * [new branch] gh/SherlockNoMad/15/head -> origin/gh/SherlockNoMad/15/head 2025-12-04T11:11:09.6022762Z * [new branch] gh/SherlockNoMad/15/orig -> origin/gh/SherlockNoMad/15/orig 2025-12-04T11:11:09.6022966Z * [new branch] gh/SherlockNoMad/17/base -> origin/gh/SherlockNoMad/17/base 2025-12-04T11:11:09.6023163Z * [new branch] gh/SherlockNoMad/17/head -> origin/gh/SherlockNoMad/17/head 2025-12-04T11:11:09.6023366Z * [new branch] gh/SherlockNoMad/17/orig -> origin/gh/SherlockNoMad/17/orig 2025-12-04T11:11:09.6023569Z * [new branch] gh/SherlockNoMad/18/base -> origin/gh/SherlockNoMad/18/base 2025-12-04T11:11:09.6023767Z * [new branch] gh/SherlockNoMad/18/head -> origin/gh/SherlockNoMad/18/head 2025-12-04T11:11:09.6023971Z * [new branch] gh/SherlockNoMad/18/orig -> origin/gh/SherlockNoMad/18/orig 2025-12-04T11:11:09.6024172Z * [new branch] gh/SherlockNoMad/19/base -> origin/gh/SherlockNoMad/19/base 2025-12-04T11:11:09.6024372Z * [new branch] gh/SherlockNoMad/19/head -> origin/gh/SherlockNoMad/19/head 2025-12-04T11:11:09.6024580Z * [new branch] gh/SherlockNoMad/19/orig -> origin/gh/SherlockNoMad/19/orig 2025-12-04T11:11:09.6024780Z * [new branch] gh/SherlockNoMad/2/base -> origin/gh/SherlockNoMad/2/base 2025-12-04T11:11:09.6024977Z * [new branch] gh/SherlockNoMad/2/head -> origin/gh/SherlockNoMad/2/head 2025-12-04T11:11:09.6025180Z * [new branch] gh/SherlockNoMad/20/base -> origin/gh/SherlockNoMad/20/base 2025-12-04T11:11:09.6025383Z * [new branch] gh/SherlockNoMad/20/head -> origin/gh/SherlockNoMad/20/head 2025-12-04T11:11:09.6025580Z * [new branch] gh/SherlockNoMad/20/orig -> origin/gh/SherlockNoMad/20/orig 2025-12-04T11:11:09.6025812Z * [new branch] gh/SherlockNoMad/21/base -> origin/gh/SherlockNoMad/21/base 2025-12-04T11:11:09.6026015Z * [new branch] gh/SherlockNoMad/21/head -> origin/gh/SherlockNoMad/21/head 2025-12-04T11:11:09.6026214Z * [new branch] gh/SherlockNoMad/21/orig -> origin/gh/SherlockNoMad/21/orig 2025-12-04T11:11:09.6026439Z * [new branch] gh/SherlockNoMad/3/base -> origin/gh/SherlockNoMad/3/base 2025-12-04T11:11:09.6026639Z * [new branch] gh/SherlockNoMad/3/head -> origin/gh/SherlockNoMad/3/head 2025-12-04T11:11:09.6026832Z * [new branch] gh/SherlockNoMad/4/base -> origin/gh/SherlockNoMad/4/base 2025-12-04T11:11:09.6027028Z * [new branch] gh/SherlockNoMad/4/head -> origin/gh/SherlockNoMad/4/head 2025-12-04T11:11:09.6027227Z * 
[new branch] gh/SherlockNoMad/5/base -> origin/gh/SherlockNoMad/5/base 2025-12-04T11:11:09.6027421Z * [new branch] gh/SherlockNoMad/5/head -> origin/gh/SherlockNoMad/5/head 2025-12-04T11:11:09.6027639Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-12-04T11:11:09.6027859Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-12-04T11:11:09.6028072Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-12-04T11:11:09.6028326Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-12-04T11:11:09.6028530Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-12-04T11:11:09.6028721Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-12-04T11:11:09.6028915Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-12-04T11:11:09.6029110Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-12-04T11:11:09.6029302Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-12-04T11:11:09.6029497Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-12-04T11:11:09.6029684Z * [new branch] gh/StrongerXi/73/base -> origin/gh/StrongerXi/73/base 2025-12-04T11:11:09.6029880Z * [new branch] gh/StrongerXi/73/head -> origin/gh/StrongerXi/73/head 2025-12-04T11:11:09.6030069Z * [new branch] gh/StrongerXi/73/orig -> origin/gh/StrongerXi/73/orig 2025-12-04T11:11:09.6030256Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-12-04T11:11:09.6030444Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-12-04T11:11:09.6030629Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-12-04T11:11:09.6030810Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-12-04T11:11:09.6030994Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-12-04T11:11:09.6031178Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-12-04T11:11:09.6031358Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-12-04T11:11:09.6031541Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-12-04T11:11:09.6031725Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-12-04T11:11:09.6031907Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-12-04T11:11:09.6032091Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-12-04T11:11:09.6032277Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-12-04T11:11:09.6032457Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-12-04T11:11:09.6032638Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-12-04T11:11:09.6032857Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-12-04T11:11:09.6033043Z * [new branch] gh/XilunWu/171/base -> origin/gh/XilunWu/171/base 2025-12-04T11:11:09.6033254Z * [new branch] gh/XilunWu/171/head -> origin/gh/XilunWu/171/head 2025-12-04T11:11:09.6033438Z * [new branch] gh/XilunWu/171/orig -> origin/gh/XilunWu/171/orig 2025-12-04T11:11:09.6033621Z * [new branch] gh/XilunWu/173/base -> origin/gh/XilunWu/173/base 2025-12-04T11:11:09.6033802Z * [new branch] gh/XilunWu/173/head -> origin/gh/XilunWu/173/head 2025-12-04T11:11:09.6033981Z * [new branch] gh/XilunWu/173/orig -> origin/gh/XilunWu/173/orig 2025-12-04T11:11:09.6034165Z * [new branch] gh/XilunWu/175/base -> origin/gh/XilunWu/175/base 
2025-12-04T11:11:09.6034349Z * [new branch] gh/XilunWu/175/head -> origin/gh/XilunWu/175/head 2025-12-04T11:11:09.6034534Z * [new branch] gh/XilunWu/175/orig -> origin/gh/XilunWu/175/orig 2025-12-04T11:11:09.6034716Z * [new branch] gh/XilunWu/176/base -> origin/gh/XilunWu/176/base 2025-12-04T11:11:09.6034900Z * [new branch] gh/XilunWu/176/head -> origin/gh/XilunWu/176/head 2025-12-04T11:11:09.6035080Z * [new branch] gh/XilunWu/176/orig -> origin/gh/XilunWu/176/orig 2025-12-04T11:11:09.6035267Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-12-04T11:11:09.6035459Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-12-04T11:11:09.6035645Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-12-04T11:11:09.6035838Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-12-04T11:11:09.6036035Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-12-04T11:11:09.6036221Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-12-04T11:11:09.6036411Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-12-04T11:11:09.6036608Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-12-04T11:11:09.6036793Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-12-04T11:11:09.6036984Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-12-04T11:11:09.6037175Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-12-04T11:11:09.6037365Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-12-04T11:11:09.6037553Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-12-04T11:11:09.6037743Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-12-04T11:11:09.6037931Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-12-04T11:11:09.6038120Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-12-04T11:11:09.6038351Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-12-04T11:11:09.6038538Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-12-04T11:11:09.6038726Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-12-04T11:11:09.6038912Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-12-04T11:11:09.6039104Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-12-04T11:11:09.6039297Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-12-04T11:11:09.6039517Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-12-04T11:11:09.6039707Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-12-04T11:11:09.6039897Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-12-04T11:11:09.6040127Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-12-04T11:11:09.6040318Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-12-04T11:11:09.6040507Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-12-04T11:11:09.6040694Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-12-04T11:11:09.6040883Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-12-04T11:11:09.6041071Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-12-04T11:11:09.6041259Z * [new branch] 
gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-12-04T11:11:09.6041450Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-12-04T11:11:09.6041638Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-12-04T11:11:09.6041827Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-12-04T11:11:09.6042015Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-12-04T11:11:09.6042202Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-12-04T11:11:09.6042391Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-12-04T11:11:09.6042581Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-12-04T11:11:09.6042768Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-12-04T11:11:09.6042960Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-12-04T11:11:09.6043151Z * [new branch] gh/XuehaiPan/390/base -> origin/gh/XuehaiPan/390/base 2025-12-04T11:11:09.6043343Z * [new branch] gh/XuehaiPan/390/head -> origin/gh/XuehaiPan/390/head 2025-12-04T11:11:09.6043533Z * [new branch] gh/XuehaiPan/390/orig -> origin/gh/XuehaiPan/390/orig 2025-12-04T11:11:09.6043722Z * [new branch] gh/XuehaiPan/391/base -> origin/gh/XuehaiPan/391/base 2025-12-04T11:11:09.6043908Z * [new branch] gh/XuehaiPan/391/head -> origin/gh/XuehaiPan/391/head 2025-12-04T11:11:09.6044096Z * [new branch] gh/XuehaiPan/391/orig -> origin/gh/XuehaiPan/391/orig 2025-12-04T11:11:09.6044286Z * [new branch] gh/XuehaiPan/392/base -> origin/gh/XuehaiPan/392/base 2025-12-04T11:11:09.6044471Z * [new branch] gh/XuehaiPan/392/head -> origin/gh/XuehaiPan/392/head 2025-12-04T11:11:09.6044666Z * [new branch] gh/XuehaiPan/392/orig -> origin/gh/XuehaiPan/392/orig 2025-12-04T11:11:09.6044857Z * [new branch] gh/XuehaiPan/394/base -> origin/gh/XuehaiPan/394/base 2025-12-04T11:11:09.6045046Z * [new branch] gh/XuehaiPan/394/head -> origin/gh/XuehaiPan/394/head 2025-12-04T11:11:09.6045234Z * [new branch] gh/XuehaiPan/394/orig -> origin/gh/XuehaiPan/394/orig 2025-12-04T11:11:09.6045425Z * [new branch] gh/XuehaiPan/397/base -> origin/gh/XuehaiPan/397/base 2025-12-04T11:11:09.6045611Z * [new branch] gh/XuehaiPan/397/head -> origin/gh/XuehaiPan/397/head 2025-12-04T11:11:09.6045807Z * [new branch] gh/XuehaiPan/397/orig -> origin/gh/XuehaiPan/397/orig 2025-12-04T11:11:09.6045998Z * [new branch] gh/XuehaiPan/398/base -> origin/gh/XuehaiPan/398/base 2025-12-04T11:11:09.6046213Z * [new branch] gh/XuehaiPan/398/head -> origin/gh/XuehaiPan/398/head 2025-12-04T11:11:09.6046405Z * [new branch] gh/XuehaiPan/398/orig -> origin/gh/XuehaiPan/398/orig 2025-12-04T11:11:09.6046591Z * [new branch] gh/XuehaiPan/399/base -> origin/gh/XuehaiPan/399/base 2025-12-04T11:11:09.6046812Z * [new branch] gh/XuehaiPan/399/head -> origin/gh/XuehaiPan/399/head 2025-12-04T11:11:09.6047001Z * [new branch] gh/XuehaiPan/399/orig -> origin/gh/XuehaiPan/399/orig 2025-12-04T11:11:09.6047188Z * [new branch] gh/XuehaiPan/400/base -> origin/gh/XuehaiPan/400/base 2025-12-04T11:11:09.6047376Z * [new branch] gh/XuehaiPan/400/head -> origin/gh/XuehaiPan/400/head 2025-12-04T11:11:09.6047566Z * [new branch] gh/XuehaiPan/400/orig -> origin/gh/XuehaiPan/400/orig 2025-12-04T11:11:09.6047762Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-12-04T11:11:09.6047964Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-12-04T11:11:09.6048201Z * [new branch] 
gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-12-04T11:11:09.6048392Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-12-04T11:11:09.6048591Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-12-04T11:11:09.6048785Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-12-04T11:11:09.6048975Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-12-04T11:11:09.6049166Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-12-04T11:11:09.6049358Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-12-04T11:11:09.6049547Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-12-04T11:11:09.6049741Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-12-04T11:11:09.6049936Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-12-04T11:11:09.6050128Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-12-04T11:11:09.6050321Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-12-04T11:11:09.6050514Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-12-04T11:11:09.6050705Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-12-04T11:11:09.6050898Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-12-04T11:11:09.6051087Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-12-04T11:11:09.6051278Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-12-04T11:11:09.6051469Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-12-04T11:11:09.6051652Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-12-04T11:11:09.6051840Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-12-04T11:11:09.6052029Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-12-04T11:11:09.6052217Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-12-04T11:11:09.6052404Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-12-04T11:11:09.6052585Z * [new branch] gh/albanD/4/base -> origin/gh/albanD/4/base 2025-12-04T11:11:09.6052764Z * [new branch] gh/albanD/4/head -> origin/gh/albanD/4/head 2025-12-04T11:11:09.6052944Z * [new branch] gh/albanD/4/orig -> origin/gh/albanD/4/orig 2025-12-04T11:11:09.6053249Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-12-04T11:11:09.6053526Z * [new branch] gh/alexsamardzic/12/base -> origin/gh/alexsamardzic/12/base 2025-12-04T11:11:09.6053770Z * [new branch] gh/alexsamardzic/12/head -> origin/gh/alexsamardzic/12/head 2025-12-04T11:11:09.6053981Z * [new branch] gh/alexsamardzic/12/orig -> origin/gh/alexsamardzic/12/orig 2025-12-04T11:11:09.6054183Z * [new branch] gh/alexsamardzic/14/base -> origin/gh/alexsamardzic/14/base 2025-12-04T11:11:09.6054388Z * [new branch] gh/alexsamardzic/14/head -> origin/gh/alexsamardzic/14/head 2025-12-04T11:11:09.6054595Z * [new branch] gh/alexsamardzic/14/orig -> origin/gh/alexsamardzic/14/orig 2025-12-04T11:11:09.6054797Z * [new branch] gh/alexsamardzic/15/base -> origin/gh/alexsamardzic/15/base 2025-12-04T11:11:09.6055004Z * [new branch] gh/alexsamardzic/15/head -> origin/gh/alexsamardzic/15/head 2025-12-04T11:11:09.6055210Z * [new branch] 
gh/alexsamardzic/15/orig -> origin/gh/alexsamardzic/15/orig 2025-12-04T11:11:09.6055410Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-12-04T11:11:09.6055600Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-12-04T11:11:09.6055786Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-12-04T11:11:09.6055976Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-12-04T11:11:09.6056173Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-12-04T11:11:09.6056362Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-12-04T11:11:09.6056555Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-12-04T11:11:09.6056749Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-12-04T11:11:09.6056938Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-12-04T11:11:09.6057136Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-12-04T11:11:09.6057330Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-12-04T11:11:09.6057519Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-12-04T11:11:09.6057717Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-12-04T11:11:09.6057911Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-12-04T11:11:09.6058101Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-12-04T11:11:09.6058331Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-12-04T11:11:09.6058527Z * [new branch] gh/andyanwang/42/base -> origin/gh/andyanwang/42/base 2025-12-04T11:11:09.6058717Z * [new branch] gh/andyanwang/42/head -> origin/gh/andyanwang/42/head 2025-12-04T11:11:09.6058916Z * [new branch] gh/andyanwang/42/orig -> origin/gh/andyanwang/42/orig 2025-12-04T11:11:09.6059110Z * [new branch] gh/andyanwang/45/base -> origin/gh/andyanwang/45/base 2025-12-04T11:11:09.6059299Z * [new branch] gh/andyanwang/45/head -> origin/gh/andyanwang/45/head 2025-12-04T11:11:09.6059493Z * [new branch] gh/andyanwang/45/orig -> origin/gh/andyanwang/45/orig 2025-12-04T11:11:09.6059686Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-12-04T11:11:09.6059874Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-12-04T11:11:09.6060101Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-12-04T11:11:09.6060294Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-12-04T11:11:09.6060480Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-12-04T11:11:09.6060701Z * [new branch] gh/angelayi/116/base -> origin/gh/angelayi/116/base 2025-12-04T11:11:09.6060891Z * [new branch] gh/angelayi/116/head -> origin/gh/angelayi/116/head 2025-12-04T11:11:09.6061075Z * [new branch] gh/angelayi/116/orig -> origin/gh/angelayi/116/orig 2025-12-04T11:11:09.6061268Z * [new branch] gh/angelayi/122/base -> origin/gh/angelayi/122/base 2025-12-04T11:11:09.6061454Z * [new branch] gh/angelayi/122/head -> origin/gh/angelayi/122/head 2025-12-04T11:11:09.6061647Z * [new branch] gh/angelayi/122/orig -> origin/gh/angelayi/122/orig 2025-12-04T11:11:09.6061842Z * [new branch] gh/angelayi/124/base -> origin/gh/angelayi/124/base 2025-12-04T11:11:09.6062030Z * [new branch] gh/angelayi/124/head -> origin/gh/angelayi/124/head 2025-12-04T11:11:09.6062223Z * [new branch] gh/angelayi/124/orig -> origin/gh/angelayi/124/orig 
2025-12-04T11:11:09.6062423Z * [new branch] gh/angelayi/128/base -> origin/gh/angelayi/128/base 2025-12-04T11:11:09.6062609Z * [new branch] gh/angelayi/128/head -> origin/gh/angelayi/128/head 2025-12-04T11:11:09.6062800Z * [new branch] gh/angelayi/128/orig -> origin/gh/angelayi/128/orig 2025-12-04T11:11:09.6062992Z * [new branch] gh/angelayi/131/base -> origin/gh/angelayi/131/base 2025-12-04T11:11:09.6063179Z * [new branch] gh/angelayi/131/head -> origin/gh/angelayi/131/head 2025-12-04T11:11:09.6063371Z * [new branch] gh/angelayi/131/orig -> origin/gh/angelayi/131/orig 2025-12-04T11:11:09.6063566Z * [new branch] gh/angelayi/132/base -> origin/gh/angelayi/132/base 2025-12-04T11:11:09.6063753Z * [new branch] gh/angelayi/132/head -> origin/gh/angelayi/132/head 2025-12-04T11:11:09.6063946Z * [new branch] gh/angelayi/132/orig -> origin/gh/angelayi/132/orig 2025-12-04T11:11:09.6064147Z * [new branch] gh/angelayi/133/base -> origin/gh/angelayi/133/base 2025-12-04T11:11:09.6064335Z * [new branch] gh/angelayi/133/head -> origin/gh/angelayi/133/head 2025-12-04T11:11:09.6064526Z * [new branch] gh/angelayi/133/orig -> origin/gh/angelayi/133/orig 2025-12-04T11:11:09.6064719Z * [new branch] gh/angelayi/134/base -> origin/gh/angelayi/134/base 2025-12-04T11:11:09.6064906Z * [new branch] gh/angelayi/134/head -> origin/gh/angelayi/134/head 2025-12-04T11:11:09.6065099Z * [new branch] gh/angelayi/134/orig -> origin/gh/angelayi/134/orig 2025-12-04T11:11:09.6065296Z * [new branch] gh/angelayi/135/base -> origin/gh/angelayi/135/base 2025-12-04T11:11:09.6065484Z * [new branch] gh/angelayi/135/head -> origin/gh/angelayi/135/head 2025-12-04T11:11:09.6065675Z * [new branch] gh/angelayi/135/orig -> origin/gh/angelayi/135/orig 2025-12-04T11:11:09.6065864Z * [new branch] gh/angelayi/136/base -> origin/gh/angelayi/136/base 2025-12-04T11:11:09.6066057Z * [new branch] gh/angelayi/136/head -> origin/gh/angelayi/136/head 2025-12-04T11:11:09.6066250Z * [new branch] gh/angelayi/136/orig -> origin/gh/angelayi/136/orig 2025-12-04T11:11:09.6066437Z * [new branch] gh/angelayi/137/base -> origin/gh/angelayi/137/base 2025-12-04T11:11:09.6066630Z * [new branch] gh/angelayi/137/head -> origin/gh/angelayi/137/head 2025-12-04T11:11:09.6066823Z * [new branch] gh/angelayi/137/orig -> origin/gh/angelayi/137/orig 2025-12-04T11:11:09.6067036Z * [new branch] gh/angelayi/138/base -> origin/gh/angelayi/138/base 2025-12-04T11:11:09.6067229Z * [new branch] gh/angelayi/138/head -> origin/gh/angelayi/138/head 2025-12-04T11:11:09.6067421Z * [new branch] gh/angelayi/138/orig -> origin/gh/angelayi/138/orig 2025-12-04T11:11:09.6067632Z * [new branch] gh/angelayi/139/base -> origin/gh/angelayi/139/base 2025-12-04T11:11:09.6067825Z * [new branch] gh/angelayi/139/head -> origin/gh/angelayi/139/head 2025-12-04T11:11:09.6068017Z * [new branch] gh/angelayi/139/orig -> origin/gh/angelayi/139/orig 2025-12-04T11:11:09.6068239Z * [new branch] gh/angelayi/140/base -> origin/gh/angelayi/140/base 2025-12-04T11:11:09.6068431Z * [new branch] gh/angelayi/140/head -> origin/gh/angelayi/140/head 2025-12-04T11:11:09.6068623Z * [new branch] gh/angelayi/140/orig -> origin/gh/angelayi/140/orig 2025-12-04T11:11:09.6068813Z * [new branch] gh/angelayi/141/base -> origin/gh/angelayi/141/base 2025-12-04T11:11:09.6069006Z * [new branch] gh/angelayi/141/head -> origin/gh/angelayi/141/head 2025-12-04T11:11:09.6069201Z * [new branch] gh/angelayi/141/orig -> origin/gh/angelayi/141/orig 2025-12-04T11:11:09.6069388Z * [new branch] gh/angelayi/142/base -> origin/gh/angelayi/142/base 
2025-12-04T11:11:09.6069581Z * [new branch] gh/angelayi/142/head -> origin/gh/angelayi/142/head 2025-12-04T11:11:09.6069771Z * [new branch] gh/angelayi/142/orig -> origin/gh/angelayi/142/orig 2025-12-04T11:11:09.6069957Z * [new branch] gh/angelayi/143/base -> origin/gh/angelayi/143/base 2025-12-04T11:11:09.6070208Z * [new branch] gh/angelayi/143/head -> origin/gh/angelayi/143/head 2025-12-04T11:11:09.6070398Z * [new branch] gh/angelayi/143/orig -> origin/gh/angelayi/143/orig 2025-12-04T11:11:09.6070595Z * [new branch] gh/angelayi/144/base -> origin/gh/angelayi/144/base 2025-12-04T11:11:09.6070789Z * [new branch] gh/angelayi/144/head -> origin/gh/angelayi/144/head 2025-12-04T11:11:09.6070981Z * [new branch] gh/angelayi/144/orig -> origin/gh/angelayi/144/orig 2025-12-04T11:11:09.6071182Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-12-04T11:11:09.6071387Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-12-04T11:11:09.6071582Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-12-04T11:11:09.6071784Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-12-04T11:11:09.6071984Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-12-04T11:11:09.6072174Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-12-04T11:11:09.6072376Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-12-04T11:11:09.6072574Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-12-04T11:11:09.6072770Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-12-04T11:11:09.6072969Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-12-04T11:11:09.6073167Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-12-04T11:11:09.6073361Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-12-04T11:11:09.6073557Z * [new branch] gh/anijain2305/870/base -> origin/gh/anijain2305/870/base 2025-12-04T11:11:09.6073748Z * [new branch] gh/anijain2305/870/head -> origin/gh/anijain2305/870/head 2025-12-04T11:11:09.6073980Z * [new branch] gh/anijain2305/870/orig -> origin/gh/anijain2305/870/orig 2025-12-04T11:11:09.6074176Z * [new branch] gh/anijain2305/873/base -> origin/gh/anijain2305/873/base 2025-12-04T11:11:09.6074368Z * [new branch] gh/anijain2305/873/head -> origin/gh/anijain2305/873/head 2025-12-04T11:11:09.6074706Z * [new branch] gh/anijain2305/873/orig -> origin/gh/anijain2305/873/orig 2025-12-04T11:11:09.6074898Z * [new branch] gh/anijain2305/894/base -> origin/gh/anijain2305/894/base 2025-12-04T11:11:09.6075090Z * [new branch] gh/anijain2305/894/head -> origin/gh/anijain2305/894/head 2025-12-04T11:11:09.6075279Z * [new branch] gh/anijain2305/894/orig -> origin/gh/anijain2305/894/orig 2025-12-04T11:11:09.6075474Z * [new branch] gh/anijain2305/895/base -> origin/gh/anijain2305/895/base 2025-12-04T11:11:09.6075662Z * [new branch] gh/anijain2305/895/head -> origin/gh/anijain2305/895/head 2025-12-04T11:11:09.6075862Z * [new branch] gh/anijain2305/895/orig -> origin/gh/anijain2305/895/orig 2025-12-04T11:11:09.6076053Z * [new branch] gh/anijain2305/910/base -> origin/gh/anijain2305/910/base 2025-12-04T11:11:09.6076245Z * [new branch] gh/anijain2305/910/head -> origin/gh/anijain2305/910/head 2025-12-04T11:11:09.6076439Z * [new branch] gh/anijain2305/910/orig -> origin/gh/anijain2305/910/orig 2025-12-04T11:11:09.6076631Z * 
[new branch] gh/anijain2305/919/base -> origin/gh/anijain2305/919/base 2025-12-04T11:11:09.6076820Z * [new branch] gh/anijain2305/919/head -> origin/gh/anijain2305/919/head 2025-12-04T11:11:09.6077014Z * [new branch] gh/anijain2305/919/orig -> origin/gh/anijain2305/919/orig 2025-12-04T11:11:09.6077208Z * [new branch] gh/anijain2305/922/base -> origin/gh/anijain2305/922/base 2025-12-04T11:11:09.6077397Z * [new branch] gh/anijain2305/922/head -> origin/gh/anijain2305/922/head 2025-12-04T11:11:09.6077591Z * [new branch] gh/anijain2305/922/orig -> origin/gh/anijain2305/922/orig 2025-12-04T11:11:09.6077784Z * [new branch] gh/anijain2305/932/base -> origin/gh/anijain2305/932/base 2025-12-04T11:11:09.6077977Z * [new branch] gh/anijain2305/932/head -> origin/gh/anijain2305/932/head 2025-12-04T11:11:09.6078206Z * [new branch] gh/anijain2305/932/orig -> origin/gh/anijain2305/932/orig 2025-12-04T11:11:09.6078400Z * [new branch] gh/anijain2305/940/base -> origin/gh/anijain2305/940/base 2025-12-04T11:11:09.6078589Z * [new branch] gh/anijain2305/940/head -> origin/gh/anijain2305/940/head 2025-12-04T11:11:09.6078781Z * [new branch] gh/anijain2305/940/orig -> origin/gh/anijain2305/940/orig 2025-12-04T11:11:09.6078976Z * [new branch] gh/anijain2305/941/base -> origin/gh/anijain2305/941/base 2025-12-04T11:11:09.6079168Z * [new branch] gh/anijain2305/941/head -> origin/gh/anijain2305/941/head 2025-12-04T11:11:09.6079361Z * [new branch] gh/anijain2305/941/orig -> origin/gh/anijain2305/941/orig 2025-12-04T11:11:09.6079555Z * [new branch] gh/anijain2305/942/base -> origin/gh/anijain2305/942/base 2025-12-04T11:11:09.6079746Z * [new branch] gh/anijain2305/942/head -> origin/gh/anijain2305/942/head 2025-12-04T11:11:09.6079940Z * [new branch] gh/anijain2305/942/orig -> origin/gh/anijain2305/942/orig 2025-12-04T11:11:09.6080130Z * [new branch] gh/anijain2305/943/base -> origin/gh/anijain2305/943/base 2025-12-04T11:11:09.6080323Z * [new branch] gh/anijain2305/943/head -> origin/gh/anijain2305/943/head 2025-12-04T11:11:09.6080516Z * [new branch] gh/anijain2305/943/orig -> origin/gh/anijain2305/943/orig 2025-12-04T11:11:09.6080705Z * [new branch] gh/anijain2305/944/base -> origin/gh/anijain2305/944/base 2025-12-04T11:11:09.6080934Z * [new branch] gh/anijain2305/944/head -> origin/gh/anijain2305/944/head 2025-12-04T11:11:09.6081129Z * [new branch] gh/anijain2305/944/orig -> origin/gh/anijain2305/944/orig 2025-12-04T11:11:09.6081352Z * [new branch] gh/anijain2305/945/base -> origin/gh/anijain2305/945/base 2025-12-04T11:11:09.6081545Z * [new branch] gh/anijain2305/945/head -> origin/gh/anijain2305/945/head 2025-12-04T11:11:09.6081740Z * [new branch] gh/anijain2305/945/orig -> origin/gh/anijain2305/945/orig 2025-12-04T11:11:09.6081930Z * [new branch] gh/anijain2305/946/base -> origin/gh/anijain2305/946/base 2025-12-04T11:11:09.6082123Z * [new branch] gh/anijain2305/946/head -> origin/gh/anijain2305/946/head 2025-12-04T11:11:09.6082316Z * [new branch] gh/anijain2305/946/orig -> origin/gh/anijain2305/946/orig 2025-12-04T11:11:09.6082506Z * [new branch] gh/anijain2305/947/base -> origin/gh/anijain2305/947/base 2025-12-04T11:11:09.6082701Z * [new branch] gh/anijain2305/947/head -> origin/gh/anijain2305/947/head 2025-12-04T11:11:09.6082897Z * [new branch] gh/anijain2305/947/orig -> origin/gh/anijain2305/947/orig 2025-12-04T11:11:09.6083092Z * [new branch] gh/anijain2305/948/base -> origin/gh/anijain2305/948/base 2025-12-04T11:11:09.6083287Z * [new branch] gh/anijain2305/948/head -> origin/gh/anijain2305/948/head 
2025-12-04T11:11:09.6083481Z * [new branch] gh/anijain2305/948/orig -> origin/gh/anijain2305/948/orig 2025-12-04T11:11:09.6083670Z * [new branch] gh/anijain2305/949/base -> origin/gh/anijain2305/949/base 2025-12-04T11:11:09.6083865Z * [new branch] gh/anijain2305/949/head -> origin/gh/anijain2305/949/head 2025-12-04T11:11:09.6084057Z * [new branch] gh/anijain2305/949/orig -> origin/gh/anijain2305/949/orig 2025-12-04T11:11:09.6084253Z * [new branch] gh/anijain2305/950/base -> origin/gh/anijain2305/950/base 2025-12-04T11:11:09.6084447Z * [new branch] gh/anijain2305/950/head -> origin/gh/anijain2305/950/head 2025-12-04T11:11:09.6084635Z * [new branch] gh/anijain2305/950/orig -> origin/gh/anijain2305/950/orig 2025-12-04T11:11:09.6084834Z * [new branch] gh/anijain2305/951/base -> origin/gh/anijain2305/951/base 2025-12-04T11:11:09.6085029Z * [new branch] gh/anijain2305/951/head -> origin/gh/anijain2305/951/head 2025-12-04T11:11:09.6085220Z * [new branch] gh/anijain2305/951/orig -> origin/gh/anijain2305/951/orig 2025-12-04T11:11:09.6085414Z * [new branch] gh/anijain2305/952/base -> origin/gh/anijain2305/952/base 2025-12-04T11:11:09.6085607Z * [new branch] gh/anijain2305/952/head -> origin/gh/anijain2305/952/head 2025-12-04T11:11:09.6085794Z * [new branch] gh/anijain2305/952/orig -> origin/gh/anijain2305/952/orig 2025-12-04T11:11:09.6085991Z * [new branch] gh/anijain2305/953/base -> origin/gh/anijain2305/953/base 2025-12-04T11:11:09.6086185Z * [new branch] gh/anijain2305/953/head -> origin/gh/anijain2305/953/head 2025-12-04T11:11:09.6086377Z * [new branch] gh/anijain2305/953/orig -> origin/gh/anijain2305/953/orig 2025-12-04T11:11:09.6086569Z * [new branch] gh/anijain2305/954/base -> origin/gh/anijain2305/954/base 2025-12-04T11:11:09.6086763Z * [new branch] gh/anijain2305/954/head -> origin/gh/anijain2305/954/head 2025-12-04T11:11:09.6086951Z * [new branch] gh/anijain2305/954/orig -> origin/gh/anijain2305/954/orig 2025-12-04T11:11:09.6087144Z * [new branch] gh/anijain2305/955/base -> origin/gh/anijain2305/955/base 2025-12-04T11:11:09.6087336Z * [new branch] gh/anijain2305/955/head -> origin/gh/anijain2305/955/head 2025-12-04T11:11:09.6087525Z * [new branch] gh/anijain2305/955/orig -> origin/gh/anijain2305/955/orig 2025-12-04T11:11:09.6087748Z * [new branch] gh/anijain2305/956/base -> origin/gh/anijain2305/956/base 2025-12-04T11:11:09.6087940Z * [new branch] gh/anijain2305/956/head -> origin/gh/anijain2305/956/head 2025-12-04T11:11:09.6088187Z * [new branch] gh/anijain2305/956/orig -> origin/gh/anijain2305/956/orig 2025-12-04T11:11:09.6088383Z * [new branch] gh/anijain2305/957/base -> origin/gh/anijain2305/957/base 2025-12-04T11:11:09.6088576Z * [new branch] gh/anijain2305/957/head -> origin/gh/anijain2305/957/head 2025-12-04T11:11:09.6088764Z * [new branch] gh/anijain2305/957/orig -> origin/gh/anijain2305/957/orig 2025-12-04T11:11:09.6088953Z * [new branch] gh/anijain2305/958/base -> origin/gh/anijain2305/958/base 2025-12-04T11:11:09.6089145Z * [new branch] gh/anijain2305/958/head -> origin/gh/anijain2305/958/head 2025-12-04T11:11:09.6089338Z * [new branch] gh/anijain2305/958/orig -> origin/gh/anijain2305/958/orig 2025-12-04T11:11:09.6089526Z * [new branch] gh/anijain2305/959/base -> origin/gh/anijain2305/959/base 2025-12-04T11:11:09.6089711Z * [new branch] gh/anijain2305/959/head -> origin/gh/anijain2305/959/head 2025-12-04T11:11:09.6089905Z * [new branch] gh/anijain2305/959/orig -> origin/gh/anijain2305/959/orig 2025-12-04T11:11:09.6090093Z * [new branch] gh/anijain2305/960/base -> 
origin/gh/anijain2305/960/base 2025-12-04T11:11:09.6090279Z * [new branch] gh/anijain2305/960/head -> origin/gh/anijain2305/960/head 2025-12-04T11:11:09.6090472Z * [new branch] gh/anijain2305/960/orig -> origin/gh/anijain2305/960/orig 2025-12-04T11:11:09.6090664Z * [new branch] gh/anijain2305/961/base -> origin/gh/anijain2305/961/base 2025-12-04T11:11:09.6090849Z * [new branch] gh/anijain2305/961/head -> origin/gh/anijain2305/961/head 2025-12-04T11:11:09.6091041Z * [new branch] gh/anijain2305/961/orig -> origin/gh/anijain2305/961/orig 2025-12-04T11:11:09.6091232Z * [new branch] gh/anijain2305/962/base -> origin/gh/anijain2305/962/base 2025-12-04T11:11:09.6091421Z * [new branch] gh/anijain2305/962/head -> origin/gh/anijain2305/962/head 2025-12-04T11:11:09.6091612Z * [new branch] gh/anijain2305/962/orig -> origin/gh/anijain2305/962/orig 2025-12-04T11:11:09.6091800Z * [new branch] gh/anijain2305/963/base -> origin/gh/anijain2305/963/base 2025-12-04T11:11:09.6091988Z * [new branch] gh/anijain2305/963/head -> origin/gh/anijain2305/963/head 2025-12-04T11:11:09.6092180Z * [new branch] gh/anijain2305/963/orig -> origin/gh/anijain2305/963/orig 2025-12-04T11:11:09.6092371Z * [new branch] gh/anijain2305/964/base -> origin/gh/anijain2305/964/base 2025-12-04T11:11:09.6092558Z * [new branch] gh/anijain2305/964/head -> origin/gh/anijain2305/964/head 2025-12-04T11:11:09.6092748Z * [new branch] gh/anijain2305/964/orig -> origin/gh/anijain2305/964/orig 2025-12-04T11:11:09.6092939Z * [new branch] gh/anijain2305/965/base -> origin/gh/anijain2305/965/base 2025-12-04T11:11:09.6093131Z * [new branch] gh/anijain2305/965/head -> origin/gh/anijain2305/965/head 2025-12-04T11:11:09.6093324Z * [new branch] gh/anijain2305/965/orig -> origin/gh/anijain2305/965/orig 2025-12-04T11:11:09.6093515Z * [new branch] gh/anijain2305/966/base -> origin/gh/anijain2305/966/base 2025-12-04T11:11:09.6093701Z * [new branch] gh/anijain2305/966/head -> origin/gh/anijain2305/966/head 2025-12-04T11:11:09.6093891Z * [new branch] gh/anijain2305/966/orig -> origin/gh/anijain2305/966/orig 2025-12-04T11:11:09.6094084Z * [new branch] gh/anijain2305/967/base -> origin/gh/anijain2305/967/base 2025-12-04T11:11:09.6094309Z * [new branch] gh/anijain2305/967/head -> origin/gh/anijain2305/967/head 2025-12-04T11:11:09.6094500Z * [new branch] gh/anijain2305/967/orig -> origin/gh/anijain2305/967/orig 2025-12-04T11:11:09.6094687Z * [new branch] gh/anijain2305/968/base -> origin/gh/anijain2305/968/base 2025-12-04T11:11:09.6094901Z * [new branch] gh/anijain2305/968/head -> origin/gh/anijain2305/968/head 2025-12-04T11:11:09.6095091Z * [new branch] gh/anijain2305/968/orig -> origin/gh/anijain2305/968/orig 2025-12-04T11:11:09.6095277Z * [new branch] gh/anijain2305/969/base -> origin/gh/anijain2305/969/base 2025-12-04T11:11:09.6095466Z * [new branch] gh/anijain2305/969/head -> origin/gh/anijain2305/969/head 2025-12-04T11:11:09.6095657Z * [new branch] gh/anijain2305/969/orig -> origin/gh/anijain2305/969/orig 2025-12-04T11:11:09.6095846Z * [new branch] gh/anijain2305/970/base -> origin/gh/anijain2305/970/base 2025-12-04T11:11:09.6096043Z * [new branch] gh/anijain2305/970/head -> origin/gh/anijain2305/970/head 2025-12-04T11:11:09.6096234Z * [new branch] gh/anijain2305/970/orig -> origin/gh/anijain2305/970/orig 2025-12-04T11:11:09.6096420Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-12-04T11:11:09.6096608Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-12-04T11:11:09.6096794Z * [new branch] 
gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-12-04T11:11:09.6096979Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-12-04T11:11:09.6097163Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-12-04T11:11:09.6097344Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-12-04T11:11:09.6097522Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-12-04T11:11:09.6097705Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-12-04T11:11:09.6097886Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-12-04T11:11:09.6098065Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-12-04T11:11:09.6098274Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-12-04T11:11:09.6098455Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-12-04T11:11:09.6098635Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-12-04T11:11:09.6098818Z * [new branch] gh/anshul-si/53/base -> origin/gh/anshul-si/53/base 2025-12-04T11:11:09.6099001Z * [new branch] gh/anshul-si/53/head -> origin/gh/anshul-si/53/head 2025-12-04T11:11:09.6099185Z * [new branch] gh/anshul-si/58/base -> origin/gh/anshul-si/58/base 2025-12-04T11:11:09.6099371Z * [new branch] gh/anshul-si/58/head -> origin/gh/anshul-si/58/head 2025-12-04T11:11:09.6099551Z * [new branch] gh/anshul-si/66/base -> origin/gh/anshul-si/66/base 2025-12-04T11:11:09.6099734Z * [new branch] gh/anshul-si/66/head -> origin/gh/anshul-si/66/head 2025-12-04T11:11:09.6099916Z * [new branch] gh/anshul-si/66/orig -> origin/gh/anshul-si/66/orig 2025-12-04T11:11:09.6100094Z * [new branch] gh/anshul-si/67/base -> origin/gh/anshul-si/67/base 2025-12-04T11:11:09.6100276Z * [new branch] gh/anshul-si/67/head -> origin/gh/anshul-si/67/head 2025-12-04T11:11:09.6100458Z * [new branch] gh/anshul-si/67/orig -> origin/gh/anshul-si/67/orig 2025-12-04T11:11:09.6100638Z * [new branch] gh/anshul-si/68/base -> origin/gh/anshul-si/68/base 2025-12-04T11:11:09.6100820Z * [new branch] gh/anshul-si/68/head -> origin/gh/anshul-si/68/head 2025-12-04T11:11:09.6101043Z * [new branch] gh/anshul-si/68/orig -> origin/gh/anshul-si/68/orig 2025-12-04T11:11:09.6101225Z * [new branch] gh/anshul-si/69/base -> origin/gh/anshul-si/69/base 2025-12-04T11:11:09.6101438Z * [new branch] gh/anshul-si/69/head -> origin/gh/anshul-si/69/head 2025-12-04T11:11:09.6101619Z * [new branch] gh/anshul-si/69/orig -> origin/gh/anshul-si/69/orig 2025-12-04T11:11:09.6101799Z * [new branch] gh/anshul-si/70/base -> origin/gh/anshul-si/70/base 2025-12-04T11:11:09.6101981Z * [new branch] gh/anshul-si/70/head -> origin/gh/anshul-si/70/head 2025-12-04T11:11:09.6102163Z * [new branch] gh/anshul-si/70/orig -> origin/gh/anshul-si/70/orig 2025-12-04T11:11:09.6102342Z * [new branch] gh/anshul-si/71/base -> origin/gh/anshul-si/71/base 2025-12-04T11:11:09.6102523Z * [new branch] gh/anshul-si/71/head -> origin/gh/anshul-si/71/head 2025-12-04T11:11:09.6102709Z * [new branch] gh/anshul-si/71/orig -> origin/gh/anshul-si/71/orig 2025-12-04T11:11:09.6102887Z * [new branch] gh/anshul-si/72/base -> origin/gh/anshul-si/72/base 2025-12-04T11:11:09.6103072Z * [new branch] gh/anshul-si/72/head -> origin/gh/anshul-si/72/head 2025-12-04T11:11:09.6103252Z * [new branch] gh/anshul-si/72/orig -> origin/gh/anshul-si/72/orig 2025-12-04T11:11:09.6103434Z * [new branch] gh/anshul-si/73/base -> origin/gh/anshul-si/73/base 2025-12-04T11:11:09.6103615Z * [new branch] gh/anshul-si/73/head 
-> origin/gh/anshul-si/73/head 2025-12-04T11:11:09.6103793Z * [new branch] gh/anshul-si/73/orig -> origin/gh/anshul-si/73/orig 2025-12-04T11:11:09.6103976Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-12-04T11:11:09.6104160Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-12-04T11:11:09.6104344Z * [new branch] gh/aorenste/134/base -> origin/gh/aorenste/134/base 2025-12-04T11:11:09.6104529Z * [new branch] gh/aorenste/134/head -> origin/gh/aorenste/134/head 2025-12-04T11:11:09.6104720Z * [new branch] gh/aorenste/134/orig -> origin/gh/aorenste/134/orig 2025-12-04T11:11:09.6104901Z * [new branch] gh/aorenste/139/base -> origin/gh/aorenste/139/base 2025-12-04T11:11:09.6105084Z * [new branch] gh/aorenste/139/head -> origin/gh/aorenste/139/head 2025-12-04T11:11:09.6105278Z * [new branch] gh/aorenste/139/orig -> origin/gh/aorenste/139/orig 2025-12-04T11:11:09.6105459Z * [new branch] gh/aorenste/141/base -> origin/gh/aorenste/141/base 2025-12-04T11:11:09.6105644Z * [new branch] gh/aorenste/141/head -> origin/gh/aorenste/141/head 2025-12-04T11:11:09.6105834Z * [new branch] gh/aorenste/145/base -> origin/gh/aorenste/145/base 2025-12-04T11:11:09.6106021Z * [new branch] gh/aorenste/145/head -> origin/gh/aorenste/145/head 2025-12-04T11:11:09.6106210Z * [new branch] gh/aorenste/145/orig -> origin/gh/aorenste/145/orig 2025-12-04T11:11:09.6106407Z * [new branch] gh/aorenste/146/base -> origin/gh/aorenste/146/base 2025-12-04T11:11:09.6106593Z * [new branch] gh/aorenste/146/head -> origin/gh/aorenste/146/head 2025-12-04T11:11:09.6106782Z * [new branch] gh/aorenste/146/orig -> origin/gh/aorenste/146/orig 2025-12-04T11:11:09.6106971Z * [new branch] gh/aorenste/147/base -> origin/gh/aorenste/147/base 2025-12-04T11:11:09.6107153Z * [new branch] gh/aorenste/147/head -> origin/gh/aorenste/147/head 2025-12-04T11:11:09.6107339Z * [new branch] gh/aorenste/147/orig -> origin/gh/aorenste/147/orig 2025-12-04T11:11:09.6107551Z * [new branch] gh/aorenste/148/base -> origin/gh/aorenste/148/base 2025-12-04T11:11:09.6107737Z * [new branch] gh/aorenste/148/head -> origin/gh/aorenste/148/head 2025-12-04T11:11:09.6107927Z * [new branch] gh/aorenste/148/orig -> origin/gh/aorenste/148/orig 2025-12-04T11:11:09.6108136Z * [new branch] gh/aorenste/149/base -> origin/gh/aorenste/149/base 2025-12-04T11:11:09.6108376Z * [new branch] gh/aorenste/149/head -> origin/gh/aorenste/149/head 2025-12-04T11:11:09.6108563Z * [new branch] gh/aorenste/149/orig -> origin/gh/aorenste/149/orig 2025-12-04T11:11:09.6108748Z * [new branch] gh/aorenste/150/base -> origin/gh/aorenste/150/base 2025-12-04T11:11:09.6108934Z * [new branch] gh/aorenste/150/head -> origin/gh/aorenste/150/head 2025-12-04T11:11:09.6109118Z * [new branch] gh/aorenste/150/orig -> origin/gh/aorenste/150/orig 2025-12-04T11:11:09.6109303Z * [new branch] gh/aorenste/151/base -> origin/gh/aorenste/151/base 2025-12-04T11:11:09.6109488Z * [new branch] gh/aorenste/151/head -> origin/gh/aorenste/151/head 2025-12-04T11:11:09.6109677Z * [new branch] gh/aorenste/151/orig -> origin/gh/aorenste/151/orig 2025-12-04T11:11:09.6109867Z * [new branch] gh/aorenste/152/base -> origin/gh/aorenste/152/base 2025-12-04T11:11:09.6110053Z * [new branch] gh/aorenste/152/head -> origin/gh/aorenste/152/head 2025-12-04T11:11:09.6110244Z * [new branch] gh/aorenste/152/orig -> origin/gh/aorenste/152/orig 2025-12-04T11:11:09.6110430Z * [new branch] gh/aorenste/153/base -> origin/gh/aorenste/153/base 2025-12-04T11:11:09.6110619Z * [new branch] gh/aorenste/153/head -> 
origin/gh/aorenste/153/head 2025-12-04T11:11:09.6110806Z * [new branch] gh/aorenste/153/orig -> origin/gh/aorenste/153/orig 2025-12-04T11:11:09.6110990Z * [new branch] gh/aorenste/154/base -> origin/gh/aorenste/154/base 2025-12-04T11:11:09.6111173Z * [new branch] gh/aorenste/154/head -> origin/gh/aorenste/154/head 2025-12-04T11:11:09.6111355Z * [new branch] gh/aorenste/154/orig -> origin/gh/aorenste/154/orig 2025-12-04T11:11:09.6111547Z * [new branch] gh/aorenste/155/base -> origin/gh/aorenste/155/base 2025-12-04T11:11:09.6111735Z * [new branch] gh/aorenste/155/head -> origin/gh/aorenste/155/head 2025-12-04T11:11:09.6111916Z * [new branch] gh/aorenste/155/orig -> origin/gh/aorenste/155/orig 2025-12-04T11:11:09.6112100Z * [new branch] gh/aorenste/156/base -> origin/gh/aorenste/156/base 2025-12-04T11:11:09.6112285Z * [new branch] gh/aorenste/156/head -> origin/gh/aorenste/156/head 2025-12-04T11:11:09.6112468Z * [new branch] gh/aorenste/156/orig -> origin/gh/aorenste/156/orig 2025-12-04T11:11:09.6112656Z * [new branch] gh/aorenste/157/base -> origin/gh/aorenste/157/base 2025-12-04T11:11:09.6112840Z * [new branch] gh/aorenste/157/head -> origin/gh/aorenste/157/head 2025-12-04T11:11:09.6113022Z * [new branch] gh/aorenste/157/orig -> origin/gh/aorenste/157/orig 2025-12-04T11:11:09.6113212Z * [new branch] gh/aorenste/158/base -> origin/gh/aorenste/158/base 2025-12-04T11:11:09.6113401Z * [new branch] gh/aorenste/158/head -> origin/gh/aorenste/158/head 2025-12-04T11:11:09.6113586Z * [new branch] gh/aorenste/158/orig -> origin/gh/aorenste/158/orig 2025-12-04T11:11:09.6113774Z * [new branch] gh/aorenste/159/base -> origin/gh/aorenste/159/base 2025-12-04T11:11:09.6113963Z * [new branch] gh/aorenste/159/head -> origin/gh/aorenste/159/head 2025-12-04T11:11:09.6114144Z * [new branch] gh/aorenste/159/orig -> origin/gh/aorenste/159/orig 2025-12-04T11:11:09.6114378Z * [new branch] gh/avikchaudhuri/1/base -> origin/gh/avikchaudhuri/1/base 2025-12-04T11:11:09.6114582Z * [new branch] gh/avikchaudhuri/1/head -> origin/gh/avikchaudhuri/1/head 2025-12-04T11:11:09.6114808Z * [new branch] gh/avikchaudhuri/2/base -> origin/gh/avikchaudhuri/2/base 2025-12-04T11:11:09.6115005Z * [new branch] gh/avikchaudhuri/2/head -> origin/gh/avikchaudhuri/2/head 2025-12-04T11:11:09.6115200Z * [new branch] gh/avikchaudhuri/2/orig -> origin/gh/avikchaudhuri/2/orig 2025-12-04T11:11:09.6115388Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-12-04T11:11:09.6115578Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-12-04T11:11:09.6115762Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-12-04T11:11:09.6115949Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-12-04T11:11:09.6116136Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-12-04T11:11:09.6116318Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-12-04T11:11:09.6116501Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-12-04T11:11:09.6116681Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-12-04T11:11:09.6116862Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-12-04T11:11:09.6117046Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-12-04T11:11:09.6117227Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-12-04T11:11:09.6117406Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-12-04T11:11:09.6117587Z * [new branch] gh/bdhirsh/672/base 
-> origin/gh/bdhirsh/672/base 2025-12-04T11:11:09.6117770Z * [new branch] gh/bdhirsh/672/head -> origin/gh/bdhirsh/672/head 2025-12-04T11:11:09.6117948Z * [new branch] gh/bdhirsh/672/orig -> origin/gh/bdhirsh/672/orig 2025-12-04T11:11:09.6118136Z * [new branch] gh/bdhirsh/675/base -> origin/gh/bdhirsh/675/base 2025-12-04T11:11:09.6118361Z * [new branch] gh/bdhirsh/675/head -> origin/gh/bdhirsh/675/head 2025-12-04T11:11:09.6118544Z * [new branch] gh/bdhirsh/675/orig -> origin/gh/bdhirsh/675/orig 2025-12-04T11:11:09.6118731Z * [new branch] gh/bdhirsh/676/base -> origin/gh/bdhirsh/676/base 2025-12-04T11:11:09.6118917Z * [new branch] gh/bdhirsh/676/head -> origin/gh/bdhirsh/676/head 2025-12-04T11:11:09.6119100Z * [new branch] gh/bdhirsh/676/orig -> origin/gh/bdhirsh/676/orig 2025-12-04T11:11:09.6119179Z * [new branch] gh/bdhirsh/677/base -> origin/gh/bdhirsh/677/base 2025-12-04T11:11:09.6119252Z * [new branch] gh/bdhirsh/677/head -> origin/gh/bdhirsh/677/head 2025-12-04T11:11:09.6119323Z * [new branch] gh/bdhirsh/677/orig -> origin/gh/bdhirsh/677/orig 2025-12-04T11:11:09.6119401Z * [new branch] gh/bdhirsh/678/base -> origin/gh/bdhirsh/678/base 2025-12-04T11:11:09.6119471Z * [new branch] gh/bdhirsh/678/head -> origin/gh/bdhirsh/678/head 2025-12-04T11:11:09.6119542Z * [new branch] gh/bdhirsh/678/orig -> origin/gh/bdhirsh/678/orig 2025-12-04T11:11:09.6119616Z * [new branch] gh/bdhirsh/679/base -> origin/gh/bdhirsh/679/base 2025-12-04T11:11:09.6119687Z * [new branch] gh/bdhirsh/679/head -> origin/gh/bdhirsh/679/head 2025-12-04T11:11:09.6119758Z * [new branch] gh/bdhirsh/679/orig -> origin/gh/bdhirsh/679/orig 2025-12-04T11:11:09.6119831Z * [new branch] gh/bdhirsh/680/base -> origin/gh/bdhirsh/680/base 2025-12-04T11:11:09.6119936Z * [new branch] gh/bdhirsh/680/head -> origin/gh/bdhirsh/680/head 2025-12-04T11:11:09.6120008Z * [new branch] gh/bdhirsh/680/orig -> origin/gh/bdhirsh/680/orig 2025-12-04T11:11:09.6120125Z * [new branch] gh/bdhirsh/681/base -> origin/gh/bdhirsh/681/base 2025-12-04T11:11:09.6120196Z * [new branch] gh/bdhirsh/681/head -> origin/gh/bdhirsh/681/head 2025-12-04T11:11:09.6120270Z * [new branch] gh/bdhirsh/681/orig -> origin/gh/bdhirsh/681/orig 2025-12-04T11:11:09.6120369Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-12-04T11:11:09.6120462Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-12-04T11:11:09.6120552Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-12-04T11:11:09.6120650Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-12-04T11:11:09.6120739Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-12-04T11:11:09.6120829Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-12-04T11:11:09.6120917Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-12-04T11:11:09.6121003Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-12-04T11:11:09.6121092Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-12-04T11:11:09.6121179Z * [new branch] gh/benjaminglass1/107/base -> origin/gh/benjaminglass1/107/base 2025-12-04T11:11:09.6121266Z * [new branch] gh/benjaminglass1/107/head -> origin/gh/benjaminglass1/107/head 2025-12-04T11:11:09.6121360Z * [new branch] gh/benjaminglass1/107/orig -> origin/gh/benjaminglass1/107/orig 2025-12-04T11:11:09.6121452Z * [new branch] 
gh/benjaminglass1/108/base -> origin/gh/benjaminglass1/108/base 2025-12-04T11:11:09.6121539Z * [new branch] gh/benjaminglass1/108/head -> origin/gh/benjaminglass1/108/head 2025-12-04T11:11:09.6121630Z * [new branch] gh/benjaminglass1/108/orig -> origin/gh/benjaminglass1/108/orig 2025-12-04T11:11:09.6121716Z * [new branch] gh/benjaminglass1/109/base -> origin/gh/benjaminglass1/109/base 2025-12-04T11:11:09.6121802Z * [new branch] gh/benjaminglass1/109/head -> origin/gh/benjaminglass1/109/head 2025-12-04T11:11:09.6121892Z * [new branch] gh/benjaminglass1/109/orig -> origin/gh/benjaminglass1/109/orig 2025-12-04T11:11:09.6121981Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-12-04T11:11:09.6122069Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-12-04T11:11:09.6122160Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-12-04T11:11:09.6122242Z * [new branch] gh/bobrenjc93/570/base -> origin/gh/bobrenjc93/570/base 2025-12-04T11:11:09.6122327Z * [new branch] gh/bobrenjc93/570/head -> origin/gh/bobrenjc93/570/head 2025-12-04T11:11:09.6122404Z * [new branch] gh/bobrenjc93/570/orig -> origin/gh/bobrenjc93/570/orig 2025-12-04T11:11:09.6122482Z * [new branch] gh/bobrenjc93/604/base -> origin/gh/bobrenjc93/604/base 2025-12-04T11:11:09.6122563Z * [new branch] gh/bobrenjc93/604/head -> origin/gh/bobrenjc93/604/head 2025-12-04T11:11:09.6122639Z * [new branch] gh/bobrenjc93/604/orig -> origin/gh/bobrenjc93/604/orig 2025-12-04T11:11:09.6122716Z * [new branch] gh/bobrenjc93/638/base -> origin/gh/bobrenjc93/638/base 2025-12-04T11:11:09.6122797Z * [new branch] gh/bobrenjc93/638/head -> origin/gh/bobrenjc93/638/head 2025-12-04T11:11:09.6122891Z * [new branch] gh/bobrenjc93/638/orig -> origin/gh/bobrenjc93/638/orig 2025-12-04T11:11:09.6122966Z * [new branch] gh/bobrenjc93/653/base -> origin/gh/bobrenjc93/653/base 2025-12-04T11:11:09.6123069Z * [new branch] gh/bobrenjc93/653/head -> origin/gh/bobrenjc93/653/head 2025-12-04T11:11:09.6123146Z * [new branch] gh/bobrenjc93/653/orig -> origin/gh/bobrenjc93/653/orig 2025-12-04T11:11:09.6123221Z * [new branch] gh/bobrenjc93/654/base -> origin/gh/bobrenjc93/654/base 2025-12-04T11:11:09.6123305Z * [new branch] gh/bobrenjc93/654/head -> origin/gh/bobrenjc93/654/head 2025-12-04T11:11:09.6123381Z * [new branch] gh/bobrenjc93/654/orig -> origin/gh/bobrenjc93/654/orig 2025-12-04T11:11:09.6123457Z * [new branch] gh/bobrenjc93/657/base -> origin/gh/bobrenjc93/657/base 2025-12-04T11:11:09.6123539Z * [new branch] gh/bobrenjc93/657/head -> origin/gh/bobrenjc93/657/head 2025-12-04T11:11:09.6123615Z * [new branch] gh/bobrenjc93/657/orig -> origin/gh/bobrenjc93/657/orig 2025-12-04T11:11:09.6152903Z * [new branch] gh/bobrenjc93/672/base -> origin/gh/bobrenjc93/672/base 2025-12-04T11:11:09.6153027Z * [new branch] gh/bobrenjc93/672/head -> origin/gh/bobrenjc93/672/head 2025-12-04T11:11:09.6153108Z * [new branch] gh/bobrenjc93/672/orig -> origin/gh/bobrenjc93/672/orig 2025-12-04T11:11:09.6153184Z * [new branch] gh/bobrenjc93/679/base -> origin/gh/bobrenjc93/679/base 2025-12-04T11:11:09.6153259Z * [new branch] gh/bobrenjc93/679/head -> origin/gh/bobrenjc93/679/head 2025-12-04T11:11:09.6153332Z * [new branch] gh/bobrenjc93/679/orig -> origin/gh/bobrenjc93/679/orig 2025-12-04T11:11:09.6153425Z * [new branch] gh/bobrenjc93/680/base -> origin/gh/bobrenjc93/680/base 2025-12-04T11:11:09.6153503Z * [new branch] gh/bobrenjc93/680/head -> origin/gh/bobrenjc93/680/head 2025-12-04T11:11:09.6153589Z * 
[new branch] gh/bobrenjc93/680/orig -> origin/gh/bobrenjc93/680/orig 2025-12-04T11:11:09.6153664Z * [new branch] gh/bobrenjc93/681/base -> origin/gh/bobrenjc93/681/base 2025-12-04T11:11:09.6153766Z * [new branch] gh/bobrenjc93/681/head -> origin/gh/bobrenjc93/681/head 2025-12-04T11:11:09.6153852Z * [new branch] gh/bobrenjc93/681/orig -> origin/gh/bobrenjc93/681/orig 2025-12-04T11:11:09.6153940Z * [new branch] gh/bobrenjc93/682/base -> origin/gh/bobrenjc93/682/base 2025-12-04T11:11:09.6154032Z * [new branch] gh/bobrenjc93/682/head -> origin/gh/bobrenjc93/682/head 2025-12-04T11:11:09.6154117Z * [new branch] gh/bobrenjc93/682/orig -> origin/gh/bobrenjc93/682/orig 2025-12-04T11:11:09.6154202Z * [new branch] gh/bobrenjc93/683/base -> origin/gh/bobrenjc93/683/base 2025-12-04T11:11:09.6154283Z * [new branch] gh/bobrenjc93/683/head -> origin/gh/bobrenjc93/683/head 2025-12-04T11:11:09.6154397Z * [new branch] gh/bobrenjc93/683/orig -> origin/gh/bobrenjc93/683/orig 2025-12-04T11:11:09.6154488Z * [new branch] gh/bobrenjc93/684/base -> origin/gh/bobrenjc93/684/base 2025-12-04T11:11:09.6154564Z * [new branch] gh/bobrenjc93/684/head -> origin/gh/bobrenjc93/684/head 2025-12-04T11:11:09.6154639Z * [new branch] gh/bobrenjc93/684/orig -> origin/gh/bobrenjc93/684/orig 2025-12-04T11:11:09.6154720Z * [new branch] gh/bobrenjc93/685/base -> origin/gh/bobrenjc93/685/base 2025-12-04T11:11:09.6154796Z * [new branch] gh/bobrenjc93/685/head -> origin/gh/bobrenjc93/685/head 2025-12-04T11:11:09.6154873Z * [new branch] gh/bobrenjc93/685/orig -> origin/gh/bobrenjc93/685/orig 2025-12-04T11:11:09.6154954Z * [new branch] gh/bobrenjc93/686/base -> origin/gh/bobrenjc93/686/base 2025-12-04T11:11:09.6155100Z * [new branch] gh/bobrenjc93/686/head -> origin/gh/bobrenjc93/686/head 2025-12-04T11:11:09.6155178Z * [new branch] gh/bobrenjc93/686/orig -> origin/gh/bobrenjc93/686/orig 2025-12-04T11:11:09.6155286Z * [new branch] gh/bobrenjc93/687/base -> origin/gh/bobrenjc93/687/base 2025-12-04T11:11:09.6155363Z * [new branch] gh/bobrenjc93/687/head -> origin/gh/bobrenjc93/687/head 2025-12-04T11:11:09.6155439Z * [new branch] gh/bobrenjc93/687/orig -> origin/gh/bobrenjc93/687/orig 2025-12-04T11:11:09.6155522Z * [new branch] gh/bobrenjc93/688/base -> origin/gh/bobrenjc93/688/base 2025-12-04T11:11:09.6155598Z * [new branch] gh/bobrenjc93/688/head -> origin/gh/bobrenjc93/688/head 2025-12-04T11:11:09.6155673Z * [new branch] gh/bobrenjc93/688/orig -> origin/gh/bobrenjc93/688/orig 2025-12-04T11:11:09.6155762Z * [new branch] gh/bobrenjc93/689/base -> origin/gh/bobrenjc93/689/base 2025-12-04T11:11:09.6155838Z * [new branch] gh/bobrenjc93/689/head -> origin/gh/bobrenjc93/689/head 2025-12-04T11:11:09.6155913Z * [new branch] gh/bobrenjc93/689/orig -> origin/gh/bobrenjc93/689/orig 2025-12-04T11:11:09.6155997Z * [new branch] gh/bobrenjc93/690/base -> origin/gh/bobrenjc93/690/base 2025-12-04T11:11:09.6156073Z * [new branch] gh/bobrenjc93/690/head -> origin/gh/bobrenjc93/690/head 2025-12-04T11:11:09.6156154Z * [new branch] gh/bobrenjc93/690/orig -> origin/gh/bobrenjc93/690/orig 2025-12-04T11:11:09.6156230Z * [new branch] gh/bobrenjc93/691/base -> origin/gh/bobrenjc93/691/base 2025-12-04T11:11:09.6156307Z * [new branch] gh/bobrenjc93/691/head -> origin/gh/bobrenjc93/691/head 2025-12-04T11:11:09.6156386Z * [new branch] gh/bobrenjc93/691/orig -> origin/gh/bobrenjc93/691/orig 2025-12-04T11:11:09.6156464Z * [new branch] gh/bobrenjc93/692/base -> origin/gh/bobrenjc93/692/base 2025-12-04T11:11:09.6156540Z * [new branch] gh/bobrenjc93/692/head -> 
origin/gh/bobrenjc93/692/head 2025-12-04T11:11:09.6156622Z * [new branch] gh/bobrenjc93/692/orig -> origin/gh/bobrenjc93/692/orig 2025-12-04T11:11:09.6156700Z * [new branch] gh/bobrenjc93/693/base -> origin/gh/bobrenjc93/693/base 2025-12-04T11:11:09.6156776Z * [new branch] gh/bobrenjc93/693/head -> origin/gh/bobrenjc93/693/head 2025-12-04T11:11:09.6156858Z * [new branch] gh/bobrenjc93/693/orig -> origin/gh/bobrenjc93/693/orig 2025-12-04T11:11:09.6156934Z * [new branch] gh/bobrenjc93/694/base -> origin/gh/bobrenjc93/694/base 2025-12-04T11:11:09.6157010Z * [new branch] gh/bobrenjc93/694/head -> origin/gh/bobrenjc93/694/head 2025-12-04T11:11:09.6157091Z * [new branch] gh/bobrenjc93/694/orig -> origin/gh/bobrenjc93/694/orig 2025-12-04T11:11:09.6157165Z * [new branch] gh/bobrenjc93/695/base -> origin/gh/bobrenjc93/695/base 2025-12-04T11:11:09.6157240Z * [new branch] gh/bobrenjc93/695/head -> origin/gh/bobrenjc93/695/head 2025-12-04T11:11:09.6157326Z * [new branch] gh/bobrenjc93/695/orig -> origin/gh/bobrenjc93/695/orig 2025-12-04T11:11:09.6157399Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-12-04T11:11:09.6157469Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-12-04T11:11:09.6157542Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-12-04T11:11:09.6157610Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-12-04T11:11:09.6157677Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-12-04T11:11:09.6157749Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-12-04T11:11:09.6157842Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-12-04T11:11:09.6157915Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-12-04T11:11:09.6157981Z * [new branch] gh/c00w/56/base -> origin/gh/c00w/56/base 2025-12-04T11:11:09.6158070Z * [new branch] gh/c00w/56/head -> origin/gh/c00w/56/head 2025-12-04T11:11:09.6158142Z * [new branch] gh/c00w/56/orig -> origin/gh/c00w/56/orig 2025-12-04T11:11:09.6158245Z * [new branch] gh/c00w/57/base -> origin/gh/c00w/57/base 2025-12-04T11:11:09.6158312Z * [new branch] gh/c00w/57/head -> origin/gh/c00w/57/head 2025-12-04T11:11:09.6158384Z * [new branch] gh/c00w/57/orig -> origin/gh/c00w/57/orig 2025-12-04T11:11:09.6158447Z * [new branch] gh/c00w/58/base -> origin/gh/c00w/58/base 2025-12-04T11:11:09.6158514Z * [new branch] gh/c00w/58/head -> origin/gh/c00w/58/head 2025-12-04T11:11:09.6158587Z * [new branch] gh/c00w/58/orig -> origin/gh/c00w/58/orig 2025-12-04T11:11:09.6158666Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-12-04T11:11:09.6158747Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-12-04T11:11:09.6158825Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-12-04T11:11:09.6158911Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-12-04T11:11:09.6158992Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-12-04T11:11:09.6159084Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-12-04T11:11:09.6159166Z * [new branch] gh/coconutruben/55/head -> origin/gh/coconutruben/55/head 2025-12-04T11:11:09.6159247Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-12-04T11:11:09.6159332Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-12-04T11:11:09.6159411Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-12-04T11:11:09.6159494Z * [new branch] gh/coconutruben/57/orig -> 
origin/gh/coconutruben/57/orig 2025-12-04T11:11:09.6159578Z * [new branch] gh/coconutruben/70/base -> origin/gh/coconutruben/70/base 2025-12-04T11:11:09.6159655Z * [new branch] gh/coconutruben/70/head -> origin/gh/coconutruben/70/head 2025-12-04T11:11:09.6159735Z * [new branch] gh/coconutruben/70/orig -> origin/gh/coconutruben/70/orig 2025-12-04T11:11:09.6159821Z * [new branch] gh/coconutruben/71/base -> origin/gh/coconutruben/71/base 2025-12-04T11:11:09.6159900Z * [new branch] gh/coconutruben/71/head -> origin/gh/coconutruben/71/head 2025-12-04T11:11:09.6159987Z * [new branch] gh/coconutruben/71/orig -> origin/gh/coconutruben/71/orig 2025-12-04T11:11:09.6160067Z * [new branch] gh/coconutruben/72/base -> origin/gh/coconutruben/72/base 2025-12-04T11:11:09.6160148Z * [new branch] gh/coconutruben/72/head -> origin/gh/coconutruben/72/head 2025-12-04T11:11:09.6160231Z * [new branch] gh/coconutruben/72/orig -> origin/gh/coconutruben/72/orig 2025-12-04T11:11:09.6160310Z * [new branch] gh/coconutruben/73/base -> origin/gh/coconutruben/73/base 2025-12-04T11:11:09.6160389Z * [new branch] gh/coconutruben/73/head -> origin/gh/coconutruben/73/head 2025-12-04T11:11:09.6160474Z * [new branch] gh/coconutruben/73/orig -> origin/gh/coconutruben/73/orig 2025-12-04T11:11:09.6160553Z * [new branch] gh/coconutruben/74/base -> origin/gh/coconutruben/74/base 2025-12-04T11:11:09.6160632Z * [new branch] gh/coconutruben/74/head -> origin/gh/coconutruben/74/head 2025-12-04T11:11:09.6160747Z * [new branch] gh/coconutruben/74/orig -> origin/gh/coconutruben/74/orig 2025-12-04T11:11:09.6160824Z * [new branch] gh/coconutruben/79/base -> origin/gh/coconutruben/79/base 2025-12-04T11:11:09.6160947Z * [new branch] gh/coconutruben/79/head -> origin/gh/coconutruben/79/head 2025-12-04T11:11:09.6161032Z * [new branch] gh/coconutruben/79/orig -> origin/gh/coconutruben/79/orig 2025-12-04T11:11:09.6161112Z * [new branch] gh/coconutruben/80/base -> origin/gh/coconutruben/80/base 2025-12-04T11:11:09.6161192Z * [new branch] gh/coconutruben/80/head -> origin/gh/coconutruben/80/head 2025-12-04T11:11:09.6161278Z * [new branch] gh/coconutruben/80/orig -> origin/gh/coconutruben/80/orig 2025-12-04T11:11:09.6161358Z * [new branch] gh/coconutruben/82/base -> origin/gh/coconutruben/82/base 2025-12-04T11:11:09.6161438Z * [new branch] gh/coconutruben/82/head -> origin/gh/coconutruben/82/head 2025-12-04T11:11:09.6161524Z * [new branch] gh/coconutruben/82/orig -> origin/gh/coconutruben/82/orig 2025-12-04T11:11:09.6161602Z * [new branch] gh/coconutruben/83/base -> origin/gh/coconutruben/83/base 2025-12-04T11:11:09.6161690Z * [new branch] gh/coconutruben/83/head -> origin/gh/coconutruben/83/head 2025-12-04T11:11:09.6161769Z * [new branch] gh/coconutruben/83/orig -> origin/gh/coconutruben/83/orig 2025-12-04T11:11:09.6161847Z * [new branch] gh/coconutruben/84/base -> origin/gh/coconutruben/84/base 2025-12-04T11:11:09.6161932Z * [new branch] gh/coconutruben/84/head -> origin/gh/coconutruben/84/head 2025-12-04T11:11:09.6162012Z * [new branch] gh/coconutruben/84/orig -> origin/gh/coconutruben/84/orig 2025-12-04T11:11:09.6162091Z * [new branch] gh/coconutruben/85/base -> origin/gh/coconutruben/85/base 2025-12-04T11:11:09.6162174Z * [new branch] gh/coconutruben/85/head -> origin/gh/coconutruben/85/head 2025-12-04T11:11:09.6162255Z * [new branch] gh/coconutruben/85/orig -> origin/gh/coconutruben/85/orig 2025-12-04T11:11:09.6162338Z * [new branch] gh/coconutruben/86/base -> origin/gh/coconutruben/86/base 2025-12-04T11:11:09.6162424Z * [new branch] 
gh/coconutruben/86/head -> origin/gh/coconutruben/86/head 2025-12-04T11:11:09.6162503Z * [new branch] gh/coconutruben/86/orig -> origin/gh/coconutruben/86/orig 2025-12-04T11:11:09.6162587Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-12-04T11:11:09.6162672Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-12-04T11:11:09.6162750Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-12-04T11:11:09.6162827Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-12-04T11:11:09.6162910Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-12-04T11:11:09.6162986Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-12-04T11:11:09.6163065Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-12-04T11:11:09.6163147Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-12-04T11:11:09.6163218Z * [new branch] gh/d4l3k/1/base -> origin/gh/d4l3k/1/base 2025-12-04T11:11:09.6163294Z * [new branch] gh/d4l3k/1/head -> origin/gh/d4l3k/1/head 2025-12-04T11:11:09.6163362Z * [new branch] gh/d4l3k/2/base -> origin/gh/d4l3k/2/base 2025-12-04T11:11:09.6163429Z * [new branch] gh/d4l3k/2/head -> origin/gh/d4l3k/2/head 2025-12-04T11:11:09.6163501Z * [new branch] gh/d4l3k/2/orig -> origin/gh/d4l3k/2/orig 2025-12-04T11:11:09.6163592Z * [new branch] gh/d4l3k/3/base -> origin/gh/d4l3k/3/base 2025-12-04T11:11:09.6163657Z * [new branch] gh/d4l3k/3/head -> origin/gh/d4l3k/3/head 2025-12-04T11:11:09.6163752Z * [new branch] gh/d4l3k/3/orig -> origin/gh/d4l3k/3/orig 2025-12-04T11:11:09.6163820Z * [new branch] gh/d4l3k/4/base -> origin/gh/d4l3k/4/base 2025-12-04T11:11:09.6163901Z * [new branch] gh/d4l3k/4/head -> origin/gh/d4l3k/4/head 2025-12-04T11:11:09.6163974Z * [new branch] gh/d4l3k/4/orig -> origin/gh/d4l3k/4/orig 2025-12-04T11:11:09.6164042Z * [new branch] gh/d4l3k/5/base -> origin/gh/d4l3k/5/base 2025-12-04T11:11:09.6164109Z * [new branch] gh/d4l3k/5/orig -> origin/gh/d4l3k/5/orig 2025-12-04T11:11:09.6164210Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 2025-12-04T11:11:09.6164302Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-12-04T11:11:09.6164389Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-12-04T11:11:09.6164482Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base 2025-12-04T11:11:09.6164568Z * [new branch] gh/davidberard98/399/head -> origin/gh/davidberard98/399/head 2025-12-04T11:11:09.6164653Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig 2025-12-04T11:11:09.6164738Z * [new branch] gh/desertfire/605/base -> origin/gh/desertfire/605/base 2025-12-04T11:11:09.6164814Z * [new branch] gh/desertfire/605/head -> origin/gh/desertfire/605/head 2025-12-04T11:11:09.6164888Z * [new branch] gh/desertfire/605/orig -> origin/gh/desertfire/605/orig 2025-12-04T11:11:09.6164965Z * [new branch] gh/desertfire/606/base -> origin/gh/desertfire/606/base 2025-12-04T11:11:09.6165044Z * [new branch] gh/desertfire/606/head -> origin/gh/desertfire/606/head 2025-12-04T11:11:09.6165127Z * [new branch] gh/desertfire/606/orig -> origin/gh/desertfire/606/orig 2025-12-04T11:11:09.6165205Z * [new branch] gh/desertfire/607/base -> origin/gh/desertfire/607/base 2025-12-04T11:11:09.6165282Z * [new branch] gh/desertfire/607/head -> origin/gh/desertfire/607/head 2025-12-04T11:11:09.6165362Z * [new branch] 
gh/desertfire/607/orig -> origin/gh/desertfire/607/orig 2025-12-04T11:11:09.6165439Z * [new branch] gh/desertfire/608/base -> origin/gh/desertfire/608/base 2025-12-04T11:11:09.6165516Z * [new branch] gh/desertfire/608/head -> origin/gh/desertfire/608/head 2025-12-04T11:11:09.6165598Z * [new branch] gh/desertfire/608/orig -> origin/gh/desertfire/608/orig 2025-12-04T11:11:09.6165676Z * [new branch] gh/desertfire/609/base -> origin/gh/desertfire/609/base 2025-12-04T11:11:09.6165752Z * [new branch] gh/desertfire/609/head -> origin/gh/desertfire/609/head 2025-12-04T11:11:09.6165835Z * [new branch] gh/desertfire/609/orig -> origin/gh/desertfire/609/orig 2025-12-04T11:11:09.6165913Z * [new branch] gh/desertfire/610/base -> origin/gh/desertfire/610/base 2025-12-04T11:11:09.6165990Z * [new branch] gh/desertfire/610/head -> origin/gh/desertfire/610/head 2025-12-04T11:11:09.6166069Z * [new branch] gh/desertfire/610/orig -> origin/gh/desertfire/610/orig 2025-12-04T11:11:09.6166147Z * [new branch] gh/desertfire/611/base -> origin/gh/desertfire/611/base 2025-12-04T11:11:09.6166224Z * [new branch] gh/desertfire/611/head -> origin/gh/desertfire/611/head 2025-12-04T11:11:09.6166306Z * [new branch] gh/desertfire/611/orig -> origin/gh/desertfire/611/orig 2025-12-04T11:11:09.6166402Z * [new branch] gh/desertfire/612/base -> origin/gh/desertfire/612/base 2025-12-04T11:11:09.6166481Z * [new branch] gh/desertfire/612/head -> origin/gh/desertfire/612/head 2025-12-04T11:11:09.6166564Z * [new branch] gh/desertfire/612/orig -> origin/gh/desertfire/612/orig 2025-12-04T11:11:09.6166668Z * [new branch] gh/desertfire/613/base -> origin/gh/desertfire/613/base 2025-12-04T11:11:09.6166745Z * [new branch] gh/desertfire/613/head -> origin/gh/desertfire/613/head 2025-12-04T11:11:09.6166827Z * [new branch] gh/desertfire/613/orig -> origin/gh/desertfire/613/orig 2025-12-04T11:11:09.6166904Z * [new branch] gh/desertfire/614/base -> origin/gh/desertfire/614/base 2025-12-04T11:11:09.6166986Z * [new branch] gh/desertfire/614/head -> origin/gh/desertfire/614/head 2025-12-04T11:11:09.6167063Z * [new branch] gh/desertfire/614/orig -> origin/gh/desertfire/614/orig 2025-12-04T11:11:09.6167141Z * [new branch] gh/desertfire/615/base -> origin/gh/desertfire/615/base 2025-12-04T11:11:09.6167224Z * [new branch] gh/desertfire/615/head -> origin/gh/desertfire/615/head 2025-12-04T11:11:09.6167303Z * [new branch] gh/desertfire/615/orig -> origin/gh/desertfire/615/orig 2025-12-04T11:11:09.6167378Z * [new branch] gh/desertfire/616/base -> origin/gh/desertfire/616/base 2025-12-04T11:11:09.6167459Z * [new branch] gh/desertfire/616/head -> origin/gh/desertfire/616/head 2025-12-04T11:11:09.6167534Z * [new branch] gh/desertfire/616/orig -> origin/gh/desertfire/616/orig 2025-12-04T11:11:09.6167610Z * [new branch] gh/desertfire/617/base -> origin/gh/desertfire/617/base 2025-12-04T11:11:09.6167685Z * [new branch] gh/desertfire/617/head -> origin/gh/desertfire/617/head 2025-12-04T11:11:09.6167763Z * [new branch] gh/desertfire/617/orig -> origin/gh/desertfire/617/orig 2025-12-04T11:11:09.6167838Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-12-04T11:11:09.6167911Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-12-04T11:11:09.6167993Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-12-04T11:11:09.6168067Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-12-04T11:11:09.6168139Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-12-04T11:11:09.6168270Z * [new 
branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-12-04T11:11:09.6168342Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-12-04T11:11:09.6168417Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-12-04T11:11:09.6168488Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-12-04T11:11:09.6168557Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-12-04T11:11:09.6168633Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-12-04T11:11:09.6168706Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base 2025-12-04T11:11:09.6168778Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-12-04T11:11:09.6168855Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base 2025-12-04T11:11:09.6168926Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head 2025-12-04T11:11:09.6168998Z * [new branch] gh/drisspg/194/orig -> origin/gh/drisspg/194/orig 2025-12-04T11:11:09.6169072Z * [new branch] gh/drisspg/200/base -> origin/gh/drisspg/200/base 2025-12-04T11:11:09.6169145Z * [new branch] gh/drisspg/200/head -> origin/gh/drisspg/200/head 2025-12-04T11:11:09.6169248Z * [new branch] gh/drisspg/200/orig -> origin/gh/drisspg/200/orig 2025-12-04T11:11:09.6169325Z * [new branch] gh/drisspg/218/base -> origin/gh/drisspg/218/base 2025-12-04T11:11:09.6169423Z * [new branch] gh/drisspg/218/head -> origin/gh/drisspg/218/head 2025-12-04T11:11:09.6169495Z * [new branch] gh/drisspg/218/orig -> origin/gh/drisspg/218/orig 2025-12-04T11:11:09.6169570Z * [new branch] gh/drisspg/219/base -> origin/gh/drisspg/219/base 2025-12-04T11:11:09.6169643Z * [new branch] gh/drisspg/219/head -> origin/gh/drisspg/219/head 2025-12-04T11:11:09.6169715Z * [new branch] gh/drisspg/219/orig -> origin/gh/drisspg/219/orig 2025-12-04T11:11:09.6169786Z * [new branch] gh/drisspg/220/base -> origin/gh/drisspg/220/base 2025-12-04T11:11:09.6169858Z * [new branch] gh/drisspg/220/head -> origin/gh/drisspg/220/head 2025-12-04T11:11:09.6169931Z * [new branch] gh/drisspg/220/orig -> origin/gh/drisspg/220/orig 2025-12-04T11:11:09.6170008Z * [new branch] gh/drisspg/221/base -> origin/gh/drisspg/221/base 2025-12-04T11:11:09.6170083Z * [new branch] gh/drisspg/221/head -> origin/gh/drisspg/221/head 2025-12-04T11:11:09.6170159Z * [new branch] gh/drisspg/221/orig -> origin/gh/drisspg/221/orig 2025-12-04T11:11:09.6170233Z * [new branch] gh/drisspg/222/base -> origin/gh/drisspg/222/base 2025-12-04T11:11:09.6170304Z * [new branch] gh/drisspg/222/head -> origin/gh/drisspg/222/head 2025-12-04T11:11:09.6170380Z * [new branch] gh/drisspg/222/orig -> origin/gh/drisspg/222/orig 2025-12-04T11:11:09.6170451Z * [new branch] gh/drisspg/223/base -> origin/gh/drisspg/223/base 2025-12-04T11:11:09.6170523Z * [new branch] gh/drisspg/223/head -> origin/gh/drisspg/223/head 2025-12-04T11:11:09.6170600Z * [new branch] gh/drisspg/223/orig -> origin/gh/drisspg/223/orig 2025-12-04T11:11:09.6170674Z * [new branch] gh/drisspg/224/base -> origin/gh/drisspg/224/base 2025-12-04T11:11:09.6170750Z * [new branch] gh/drisspg/224/head -> origin/gh/drisspg/224/head 2025-12-04T11:11:09.6170826Z * [new branch] gh/drisspg/224/orig -> origin/gh/drisspg/224/orig 2025-12-04T11:11:09.6170897Z * [new branch] gh/drisspg/225/base -> origin/gh/drisspg/225/base 2025-12-04T11:11:09.6170969Z * [new branch] gh/drisspg/225/head -> origin/gh/drisspg/225/head 2025-12-04T11:11:09.6171044Z * [new branch] gh/drisspg/225/orig -> origin/gh/drisspg/225/orig 
2025-12-04T11:11:09.6171113Z * [new branch] gh/drisspg/226/base -> origin/gh/drisspg/226/base 2025-12-04T11:11:09.6171183Z * [new branch] gh/drisspg/226/head -> origin/gh/drisspg/226/head 2025-12-04T11:11:09.6171260Z * [new branch] gh/drisspg/226/orig -> origin/gh/drisspg/226/orig 2025-12-04T11:11:09.6171331Z * [new branch] gh/drisspg/227/base -> origin/gh/drisspg/227/base 2025-12-04T11:11:09.6171406Z * [new branch] gh/drisspg/227/head -> origin/gh/drisspg/227/head 2025-12-04T11:11:09.6171482Z * [new branch] gh/drisspg/227/orig -> origin/gh/drisspg/227/orig 2025-12-04T11:11:09.6171554Z * [new branch] gh/drisspg/228/base -> origin/gh/drisspg/228/base 2025-12-04T11:11:09.6171626Z * [new branch] gh/drisspg/228/head -> origin/gh/drisspg/228/head 2025-12-04T11:11:09.6171704Z * [new branch] gh/drisspg/228/orig -> origin/gh/drisspg/228/orig 2025-12-04T11:11:09.6171774Z * [new branch] gh/drisspg/229/base -> origin/gh/drisspg/229/base 2025-12-04T11:11:09.6171849Z * [new branch] gh/drisspg/229/head -> origin/gh/drisspg/229/head 2025-12-04T11:11:09.6171953Z * [new branch] gh/drisspg/229/orig -> origin/gh/drisspg/229/orig 2025-12-04T11:11:09.6172025Z * [new branch] gh/drisspg/230/base -> origin/gh/drisspg/230/base 2025-12-04T11:11:09.6172123Z * [new branch] gh/drisspg/230/head -> origin/gh/drisspg/230/head 2025-12-04T11:11:09.6172194Z * [new branch] gh/drisspg/230/orig -> origin/gh/drisspg/230/orig 2025-12-04T11:11:09.6172271Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-12-04T11:11:09.6172350Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-12-04T11:11:09.6172432Z * [new branch] gh/dzmitry-huba/1/base -> origin/gh/dzmitry-huba/1/base 2025-12-04T11:11:09.6172510Z * [new branch] gh/dzmitry-huba/1/head -> origin/gh/dzmitry-huba/1/head 2025-12-04T11:11:09.6172599Z * [new branch] gh/dzmitry-huba/12/base -> origin/gh/dzmitry-huba/12/base 2025-12-04T11:11:09.6172678Z * [new branch] gh/dzmitry-huba/12/head -> origin/gh/dzmitry-huba/12/head 2025-12-04T11:11:09.6172756Z * [new branch] gh/dzmitry-huba/12/orig -> origin/gh/dzmitry-huba/12/orig 2025-12-04T11:11:09.6172840Z * [new branch] gh/dzmitry-huba/13/base -> origin/gh/dzmitry-huba/13/base 2025-12-04T11:11:09.6172917Z * [new branch] gh/dzmitry-huba/13/head -> origin/gh/dzmitry-huba/13/head 2025-12-04T11:11:09.6172994Z * [new branch] gh/dzmitry-huba/13/orig -> origin/gh/dzmitry-huba/13/orig 2025-12-04T11:11:09.6173076Z * [new branch] gh/dzmitry-huba/14/base -> origin/gh/dzmitry-huba/14/base 2025-12-04T11:11:09.6173153Z * [new branch] gh/dzmitry-huba/14/head -> origin/gh/dzmitry-huba/14/head 2025-12-04T11:11:09.6173231Z * [new branch] gh/dzmitry-huba/14/orig -> origin/gh/dzmitry-huba/14/orig 2025-12-04T11:11:09.6173314Z * [new branch] gh/dzmitry-huba/15/base -> origin/gh/dzmitry-huba/15/base 2025-12-04T11:11:09.6173391Z * [new branch] gh/dzmitry-huba/15/head -> origin/gh/dzmitry-huba/15/head 2025-12-04T11:11:09.6173469Z * [new branch] gh/dzmitry-huba/15/orig -> origin/gh/dzmitry-huba/15/orig 2025-12-04T11:11:09.6173550Z * [new branch] gh/dzmitry-huba/16/base -> origin/gh/dzmitry-huba/16/base 2025-12-04T11:11:09.6173627Z * [new branch] gh/dzmitry-huba/16/head -> origin/gh/dzmitry-huba/16/head 2025-12-04T11:11:09.6173709Z * [new branch] gh/dzmitry-huba/16/orig -> origin/gh/dzmitry-huba/16/orig 2025-12-04T11:11:09.6173787Z * [new branch] gh/dzmitry-huba/17/base -> origin/gh/dzmitry-huba/17/base 2025-12-04T11:11:09.6173864Z * [new branch] gh/dzmitry-huba/17/head -> origin/gh/dzmitry-huba/17/head 
2025-12-04T11:11:09.6173946Z * [new branch] gh/dzmitry-huba/17/orig -> origin/gh/dzmitry-huba/17/orig 2025-12-04T11:11:09.6174027Z * [new branch] gh/dzmitry-huba/2/base -> origin/gh/dzmitry-huba/2/base 2025-12-04T11:11:09.6174103Z * [new branch] gh/dzmitry-huba/2/head -> origin/gh/dzmitry-huba/2/head 2025-12-04T11:11:09.6174186Z * [new branch] gh/dzmitry-huba/3/base -> origin/gh/dzmitry-huba/3/base 2025-12-04T11:11:09.6174262Z * [new branch] gh/dzmitry-huba/3/head -> origin/gh/dzmitry-huba/3/head 2025-12-04T11:11:09.6174342Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base 2025-12-04T11:11:09.6174422Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-12-04T11:11:09.6174497Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-12-04T11:11:09.6174572Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base 2025-12-04T11:11:09.6174649Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head 2025-12-04T11:11:09.6174745Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig 2025-12-04T11:11:09.6174818Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base 2025-12-04T11:11:09.6175135Z * [new branch] gh/eellison/823/head -> origin/gh/eellison/823/head 2025-12-04T11:11:09.6175207Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig 2025-12-04T11:11:09.6175280Z * [new branch] gh/eellison/862/base -> origin/gh/eellison/862/base 2025-12-04T11:11:09.6175356Z * [new branch] gh/eellison/862/head -> origin/gh/eellison/862/head 2025-12-04T11:11:09.6175428Z * [new branch] gh/eellison/862/orig -> origin/gh/eellison/862/orig 2025-12-04T11:11:09.6175504Z * [new branch] gh/eellison/863/base -> origin/gh/eellison/863/base 2025-12-04T11:11:09.6175576Z * [new branch] gh/eellison/863/head -> origin/gh/eellison/863/head 2025-12-04T11:11:09.6175649Z * [new branch] gh/eellison/863/orig -> origin/gh/eellison/863/orig 2025-12-04T11:11:09.6175725Z * [new branch] gh/eellison/864/base -> origin/gh/eellison/864/base 2025-12-04T11:11:09.6175800Z * [new branch] gh/eellison/864/head -> origin/gh/eellison/864/head 2025-12-04T11:11:09.6175872Z * [new branch] gh/eellison/864/orig -> origin/gh/eellison/864/orig 2025-12-04T11:11:09.6175948Z * [new branch] gh/eellison/865/base -> origin/gh/eellison/865/base 2025-12-04T11:11:09.6176021Z * [new branch] gh/eellison/865/head -> origin/gh/eellison/865/head 2025-12-04T11:11:09.6176093Z * [new branch] gh/eellison/865/orig -> origin/gh/eellison/865/orig 2025-12-04T11:11:09.6176170Z * [new branch] gh/eellison/866/base -> origin/gh/eellison/866/base 2025-12-04T11:11:09.6176244Z * [new branch] gh/eellison/866/head -> origin/gh/eellison/866/head 2025-12-04T11:11:09.6176317Z * [new branch] gh/eellison/866/orig -> origin/gh/eellison/866/orig 2025-12-04T11:11:09.6176394Z * [new branch] gh/eellison/867/base -> origin/gh/eellison/867/base 2025-12-04T11:11:09.6176469Z * [new branch] gh/eellison/867/head -> origin/gh/eellison/867/head 2025-12-04T11:11:09.6176538Z * [new branch] gh/eellison/867/orig -> origin/gh/eellison/867/orig 2025-12-04T11:11:09.6176613Z * [new branch] gh/eellison/868/base -> origin/gh/eellison/868/base 2025-12-04T11:11:09.6176686Z * [new branch] gh/eellison/868/head -> origin/gh/eellison/868/head 2025-12-04T11:11:09.6176759Z * [new branch] gh/eellison/868/orig -> origin/gh/eellison/868/orig 2025-12-04T11:11:09.6176836Z * [new branch] gh/eellison/869/base -> origin/gh/eellison/869/base 2025-12-04T11:11:09.6176910Z * [new branch] gh/eellison/869/head -> 
origin/gh/eellison/869/head 2025-12-04T11:11:09.6176987Z * [new branch] gh/eellison/869/orig -> origin/gh/eellison/869/orig 2025-12-04T11:11:09.6177059Z * [new branch] gh/eellison/870/base -> origin/gh/eellison/870/base 2025-12-04T11:11:09.6177135Z * [new branch] gh/eellison/870/head -> origin/gh/eellison/870/head 2025-12-04T11:11:09.6177212Z * [new branch] gh/eellison/870/orig -> origin/gh/eellison/870/orig 2025-12-04T11:11:09.6177285Z * [new branch] gh/eellison/871/base -> origin/gh/eellison/871/base 2025-12-04T11:11:09.6177359Z * [new branch] gh/eellison/871/head -> origin/gh/eellison/871/head 2025-12-04T11:11:09.6177436Z * [new branch] gh/eellison/871/orig -> origin/gh/eellison/871/orig 2025-12-04T11:11:09.6177509Z * [new branch] gh/eellison/872/base -> origin/gh/eellison/872/base 2025-12-04T11:11:09.6177605Z * [new branch] gh/eellison/872/head -> origin/gh/eellison/872/head 2025-12-04T11:11:09.6177682Z * [new branch] gh/eellison/872/orig -> origin/gh/eellison/872/orig 2025-12-04T11:11:09.6177756Z * [new branch] gh/eellison/873/base -> origin/gh/eellison/873/base 2025-12-04T11:11:09.6177849Z * [new branch] gh/eellison/873/head -> origin/gh/eellison/873/head 2025-12-04T11:11:09.6177926Z * [new branch] gh/eellison/873/orig -> origin/gh/eellison/873/orig 2025-12-04T11:11:09.6177996Z * [new branch] gh/eellison/874/base -> origin/gh/eellison/874/base 2025-12-04T11:11:09.6178068Z * [new branch] gh/eellison/874/head -> origin/gh/eellison/874/head 2025-12-04T11:11:09.6178181Z * [new branch] gh/eellison/874/orig -> origin/gh/eellison/874/orig 2025-12-04T11:11:09.6178255Z * [new branch] gh/eellison/875/base -> origin/gh/eellison/875/base 2025-12-04T11:11:09.6178328Z * [new branch] gh/eellison/875/head -> origin/gh/eellison/875/head 2025-12-04T11:11:09.6178406Z * [new branch] gh/eellison/875/orig -> origin/gh/eellison/875/orig 2025-12-04T11:11:09.6178479Z * [new branch] gh/eellison/876/base -> origin/gh/eellison/876/base 2025-12-04T11:11:09.6178557Z * [new branch] gh/eellison/876/head -> origin/gh/eellison/876/head 2025-12-04T11:11:09.6178636Z * [new branch] gh/eellison/876/orig -> origin/gh/eellison/876/orig 2025-12-04T11:11:09.6178709Z * [new branch] gh/eellison/877/base -> origin/gh/eellison/877/base 2025-12-04T11:11:09.6178788Z * [new branch] gh/eellison/877/head -> origin/gh/eellison/877/head 2025-12-04T11:11:09.6178863Z * [new branch] gh/eellison/877/orig -> origin/gh/eellison/877/orig 2025-12-04T11:11:09.6178937Z * [new branch] gh/eellison/878/base -> origin/gh/eellison/878/base 2025-12-04T11:11:09.6179017Z * [new branch] gh/eellison/878/head -> origin/gh/eellison/878/head 2025-12-04T11:11:09.6179090Z * [new branch] gh/eellison/878/orig -> origin/gh/eellison/878/orig 2025-12-04T11:11:09.6179163Z * [new branch] gh/eellison/879/base -> origin/gh/eellison/879/base 2025-12-04T11:11:09.6179244Z * [new branch] gh/eellison/879/head -> origin/gh/eellison/879/head 2025-12-04T11:11:09.6179316Z * [new branch] gh/eellison/879/orig -> origin/gh/eellison/879/orig 2025-12-04T11:11:09.6179386Z * [new branch] gh/eellison/880/base -> origin/gh/eellison/880/base 2025-12-04T11:11:09.6179463Z * [new branch] gh/eellison/880/head -> origin/gh/eellison/880/head 2025-12-04T11:11:09.6179536Z * [new branch] gh/eellison/880/orig -> origin/gh/eellison/880/orig 2025-12-04T11:11:09.6179610Z * [new branch] gh/eellison/881/base -> origin/gh/eellison/881/base 2025-12-04T11:11:09.6179688Z * [new branch] gh/eellison/881/head -> origin/gh/eellison/881/head 2025-12-04T11:11:09.6179761Z * [new branch] gh/eellison/881/orig -> 
origin/gh/eellison/881/orig 2025-12-04T11:11:09.6179835Z * [new branch] gh/eellison/882/base -> origin/gh/eellison/882/base 2025-12-04T11:11:09.6179913Z * [new branch] gh/eellison/882/head -> origin/gh/eellison/882/head 2025-12-04T11:11:09.6179986Z * [new branch] gh/eellison/882/orig -> origin/gh/eellison/882/orig 2025-12-04T11:11:09.6180059Z * [new branch] gh/eellison/883/base -> origin/gh/eellison/883/base 2025-12-04T11:11:09.6180136Z * [new branch] gh/eellison/883/head -> origin/gh/eellison/883/head 2025-12-04T11:11:09.6180209Z * [new branch] gh/eellison/883/orig -> origin/gh/eellison/883/orig 2025-12-04T11:11:09.6180286Z * [new branch] gh/eellison/884/base -> origin/gh/eellison/884/base 2025-12-04T11:11:09.6180388Z * [new branch] gh/eellison/884/head -> origin/gh/eellison/884/head 2025-12-04T11:11:09.6180462Z * [new branch] gh/eellison/884/orig -> origin/gh/eellison/884/orig 2025-12-04T11:11:09.6180558Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-12-04T11:11:09.6180626Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-12-04T11:11:09.6180694Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-12-04T11:11:09.6180767Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-12-04T11:11:09.6180835Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 2025-12-04T11:11:09.6180901Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base 2025-12-04T11:11:09.6180972Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head 2025-12-04T11:11:09.6181042Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig 2025-12-04T11:11:09.6181108Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base 2025-12-04T11:11:09.6181181Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head 2025-12-04T11:11:09.6181249Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig 2025-12-04T11:11:09.6181317Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base 2025-12-04T11:11:09.6181383Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head 2025-12-04T11:11:09.6181448Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig 2025-12-04T11:11:09.6181515Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base 2025-12-04T11:11:09.6181582Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head 2025-12-04T11:11:09.6181650Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig 2025-12-04T11:11:09.6181720Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base 2025-12-04T11:11:09.6181784Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head 2025-12-04T11:11:09.6181850Z * [new branch] gh/etaf/160/orig -> origin/gh/etaf/160/orig 2025-12-04T11:11:09.6181917Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base 2025-12-04T11:11:09.6181983Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head 2025-12-04T11:11:09.6182050Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig 2025-12-04T11:11:09.6182119Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base 2025-12-04T11:11:09.6182185Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head 2025-12-04T11:11:09.6182252Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig 2025-12-04T11:11:09.6182319Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base 2025-12-04T11:11:09.6182384Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head 2025-12-04T11:11:09.6182452Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig 2025-12-04T11:11:09.6182517Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base 
2025-12-04T11:11:09.6182583Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head 2025-12-04T11:11:09.6182652Z * [new branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig 2025-12-04T11:11:09.6182717Z * [new branch] gh/etaf/172/base -> origin/gh/etaf/172/base 2025-12-04T11:11:09.6182781Z * [new branch] gh/etaf/172/head -> origin/gh/etaf/172/head 2025-12-04T11:11:09.6182849Z * [new branch] gh/etaf/172/orig -> origin/gh/etaf/172/orig 2025-12-04T11:11:09.6182943Z * [new branch] gh/etaf/173/base -> origin/gh/etaf/173/base 2025-12-04T11:11:09.6183010Z * [new branch] gh/etaf/173/head -> origin/gh/etaf/173/head 2025-12-04T11:11:09.6183097Z * [new branch] gh/etaf/173/orig -> origin/gh/etaf/173/orig 2025-12-04T11:11:09.6183162Z * [new branch] gh/etaf/174/base -> origin/gh/etaf/174/base 2025-12-04T11:11:09.6183227Z * [new branch] gh/etaf/174/head -> origin/gh/etaf/174/head 2025-12-04T11:11:09.6183295Z * [new branch] gh/etaf/175/base -> origin/gh/etaf/175/base 2025-12-04T11:11:09.6183361Z * [new branch] gh/etaf/175/head -> origin/gh/etaf/175/head 2025-12-04T11:11:09.6183425Z * [new branch] gh/etaf/175/orig -> origin/gh/etaf/175/orig 2025-12-04T11:11:09.6183491Z * [new branch] gh/etaf/176/base -> origin/gh/etaf/176/base 2025-12-04T11:11:09.6183557Z * [new branch] gh/etaf/176/head -> origin/gh/etaf/176/head 2025-12-04T11:11:09.6183626Z * [new branch] gh/etaf/176/orig -> origin/gh/etaf/176/orig 2025-12-04T11:11:09.6183689Z * [new branch] gh/etaf/177/base -> origin/gh/etaf/177/base 2025-12-04T11:11:09.6183754Z * [new branch] gh/etaf/177/head -> origin/gh/etaf/177/head 2025-12-04T11:11:09.6183820Z * [new branch] gh/etaf/177/orig -> origin/gh/etaf/177/orig 2025-12-04T11:11:09.6183883Z * [new branch] gh/etaf/178/base -> origin/gh/etaf/178/base 2025-12-04T11:11:09.6183948Z * [new branch] gh/etaf/178/head -> origin/gh/etaf/178/head 2025-12-04T11:11:09.6184013Z * [new branch] gh/etaf/178/orig -> origin/gh/etaf/178/orig 2025-12-04T11:11:09.6184077Z * [new branch] gh/etaf/179/base -> origin/gh/etaf/179/base 2025-12-04T11:11:09.6184143Z * [new branch] gh/etaf/179/head -> origin/gh/etaf/179/head 2025-12-04T11:11:09.6184212Z * [new branch] gh/etaf/179/orig -> origin/gh/etaf/179/orig 2025-12-04T11:11:09.6184276Z * [new branch] gh/etaf/180/base -> origin/gh/etaf/180/base 2025-12-04T11:11:09.6184344Z * [new branch] gh/etaf/180/head -> origin/gh/etaf/180/head 2025-12-04T11:11:09.6184411Z * [new branch] gh/etaf/180/orig -> origin/gh/etaf/180/orig 2025-12-04T11:11:09.6184491Z * [new branch] gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base 2025-12-04T11:11:09.6184568Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head 2025-12-04T11:11:09.6184646Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base 2025-12-04T11:11:09.6184722Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head 2025-12-04T11:11:09.6184800Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base 2025-12-04T11:11:09.6184881Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head 2025-12-04T11:11:09.6184955Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base 2025-12-04T11:11:09.6185033Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head 2025-12-04T11:11:09.6185106Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-12-04T11:11:09.6185178Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-12-04T11:11:09.6185252Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 
2025-12-04T11:11:09.6185321Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-12-04T11:11:09.6185390Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-12-04T11:11:09.6185482Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-12-04T11:11:09.6185552Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-12-04T11:11:09.6185623Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-12-04T11:11:09.6185715Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-12-04T11:11:09.6185786Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-12-04T11:11:09.6185855Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-12-04T11:11:09.6185926Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-12-04T11:11:09.6185995Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base 2025-12-04T11:11:09.6186063Z * [new branch] gh/ezyang/3139/head -> origin/gh/ezyang/3139/head 2025-12-04T11:11:09.6186134Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig 2025-12-04T11:11:09.6186204Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base 2025-12-04T11:11:09.6186272Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head 2025-12-04T11:11:09.6186345Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig 2025-12-04T11:11:09.6186413Z * [new branch] gh/ezyang/3143/base -> origin/gh/ezyang/3143/base 2025-12-04T11:11:09.6186482Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head 2025-12-04T11:11:09.6186552Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig 2025-12-04T11:11:09.6186621Z * [new branch] gh/ezyang/3144/base -> origin/gh/ezyang/3144/base 2025-12-04T11:11:09.6186689Z * [new branch] gh/ezyang/3144/head -> origin/gh/ezyang/3144/head 2025-12-04T11:11:09.6186761Z * [new branch] gh/ezyang/3144/orig -> origin/gh/ezyang/3144/orig 2025-12-04T11:11:09.6186830Z * [new branch] gh/ezyang/3167/base -> origin/gh/ezyang/3167/base 2025-12-04T11:11:09.6186899Z * [new branch] gh/ezyang/3167/head -> origin/gh/ezyang/3167/head 2025-12-04T11:11:09.6186974Z * [new branch] gh/ezyang/3167/orig -> origin/gh/ezyang/3167/orig 2025-12-04T11:11:09.6187043Z * [new branch] gh/ezyang/3173/base -> origin/gh/ezyang/3173/base 2025-12-04T11:11:09.6187114Z * [new branch] gh/ezyang/3173/head -> origin/gh/ezyang/3173/head 2025-12-04T11:11:09.6187182Z * [new branch] gh/ezyang/3173/orig -> origin/gh/ezyang/3173/orig 2025-12-04T11:11:09.6187253Z * [new branch] gh/ezyang/3175/base -> origin/gh/ezyang/3175/base 2025-12-04T11:11:09.6187324Z * [new branch] gh/ezyang/3175/head -> origin/gh/ezyang/3175/head 2025-12-04T11:11:09.6187393Z * [new branch] gh/ezyang/3175/orig -> origin/gh/ezyang/3175/orig 2025-12-04T11:11:09.6187461Z * [new branch] gh/ezyang/3182/base -> origin/gh/ezyang/3182/base 2025-12-04T11:11:09.6187530Z * [new branch] gh/ezyang/3182/head -> origin/gh/ezyang/3182/head 2025-12-04T11:11:09.6187600Z * [new branch] gh/ezyang/3182/orig -> origin/gh/ezyang/3182/orig 2025-12-04T11:11:09.6187670Z * [new branch] gh/ezyang/3185/base -> origin/gh/ezyang/3185/base 2025-12-04T11:11:09.6187741Z * [new branch] gh/ezyang/3185/head -> origin/gh/ezyang/3185/head 2025-12-04T11:11:09.6187809Z * [new branch] gh/ezyang/3185/orig -> origin/gh/ezyang/3185/orig 2025-12-04T11:11:09.6187877Z * [new branch] gh/ezyang/3189/base -> origin/gh/ezyang/3189/base 2025-12-04T11:11:09.6187948Z * [new branch] gh/ezyang/3189/head -> 
origin/gh/ezyang/3189/head 2025-12-04T11:11:09.6188034Z * [new branch] gh/ezyang/3189/orig -> origin/gh/ezyang/3189/orig 2025-12-04T11:11:09.6188104Z * [new branch] gh/ezyang/3191/base -> origin/gh/ezyang/3191/base 2025-12-04T11:11:09.6188226Z * [new branch] gh/ezyang/3191/head -> origin/gh/ezyang/3191/head 2025-12-04T11:11:09.6188294Z * [new branch] gh/ezyang/3191/orig -> origin/gh/ezyang/3191/orig 2025-12-04T11:11:09.6188364Z * [new branch] gh/ezyang/3192/base -> origin/gh/ezyang/3192/base 2025-12-04T11:11:09.6188435Z * [new branch] gh/ezyang/3192/head -> origin/gh/ezyang/3192/head 2025-12-04T11:11:09.6188504Z * [new branch] gh/ezyang/3192/orig -> origin/gh/ezyang/3192/orig 2025-12-04T11:11:09.6188573Z * [new branch] gh/ezyang/3193/base -> origin/gh/ezyang/3193/base 2025-12-04T11:11:09.6188644Z * [new branch] gh/ezyang/3193/head -> origin/gh/ezyang/3193/head 2025-12-04T11:11:09.6188714Z * [new branch] gh/ezyang/3193/orig -> origin/gh/ezyang/3193/orig 2025-12-04T11:11:09.6188784Z * [new branch] gh/ezyang/3194/base -> origin/gh/ezyang/3194/base 2025-12-04T11:11:09.6188854Z * [new branch] gh/ezyang/3194/head -> origin/gh/ezyang/3194/head 2025-12-04T11:11:09.6188922Z * [new branch] gh/ezyang/3194/orig -> origin/gh/ezyang/3194/orig 2025-12-04T11:11:09.6188993Z * [new branch] gh/ezyang/3195/base -> origin/gh/ezyang/3195/base 2025-12-04T11:11:09.6189061Z * [new branch] gh/ezyang/3195/head -> origin/gh/ezyang/3195/head 2025-12-04T11:11:09.6189130Z * [new branch] gh/ezyang/3195/orig -> origin/gh/ezyang/3195/orig 2025-12-04T11:11:09.6189201Z * [new branch] gh/ezyang/3196/base -> origin/gh/ezyang/3196/base 2025-12-04T11:11:09.6189270Z * [new branch] gh/ezyang/3196/head -> origin/gh/ezyang/3196/head 2025-12-04T11:11:09.6189342Z * [new branch] gh/ezyang/3196/orig -> origin/gh/ezyang/3196/orig 2025-12-04T11:11:09.6189414Z * [new branch] gh/ezyang/3197/base -> origin/gh/ezyang/3197/base 2025-12-04T11:11:09.6189484Z * [new branch] gh/ezyang/3197/head -> origin/gh/ezyang/3197/head 2025-12-04T11:11:09.6189552Z * [new branch] gh/ezyang/3197/orig -> origin/gh/ezyang/3197/orig 2025-12-04T11:11:09.6189622Z * [new branch] gh/ezyang/3198/base -> origin/gh/ezyang/3198/base 2025-12-04T11:11:09.6189690Z * [new branch] gh/ezyang/3198/head -> origin/gh/ezyang/3198/head 2025-12-04T11:11:09.6189758Z * [new branch] gh/ezyang/3198/orig -> origin/gh/ezyang/3198/orig 2025-12-04T11:11:09.6189829Z * [new branch] gh/ezyang/3199/base -> origin/gh/ezyang/3199/base 2025-12-04T11:11:09.6189897Z * [new branch] gh/ezyang/3199/head -> origin/gh/ezyang/3199/head 2025-12-04T11:11:09.6189967Z * [new branch] gh/ezyang/3199/orig -> origin/gh/ezyang/3199/orig 2025-12-04T11:11:09.6190038Z * [new branch] gh/ezyang/3200/base -> origin/gh/ezyang/3200/base 2025-12-04T11:11:09.6190108Z * [new branch] gh/ezyang/3200/head -> origin/gh/ezyang/3200/head 2025-12-04T11:11:09.6190176Z * [new branch] gh/ezyang/3200/orig -> origin/gh/ezyang/3200/orig 2025-12-04T11:11:09.6190246Z * [new branch] gh/ezyang/3201/base -> origin/gh/ezyang/3201/base 2025-12-04T11:11:09.6190314Z * [new branch] gh/ezyang/3201/head -> origin/gh/ezyang/3201/head 2025-12-04T11:11:09.6190385Z * [new branch] gh/ezyang/3201/orig -> origin/gh/ezyang/3201/orig 2025-12-04T11:11:09.6190453Z * [new branch] gh/ezyang/3202/base -> origin/gh/ezyang/3202/base 2025-12-04T11:11:09.6190521Z * [new branch] gh/ezyang/3202/head -> origin/gh/ezyang/3202/head 2025-12-04T11:11:09.6190623Z * [new branch] gh/ezyang/3202/orig -> origin/gh/ezyang/3202/orig 2025-12-04T11:11:09.6190693Z * [new branch] 
gh/ezyang/3203/base -> origin/gh/ezyang/3203/base 2025-12-04T11:11:09.6190792Z * [new branch] gh/ezyang/3203/head -> origin/gh/ezyang/3203/head 2025-12-04T11:11:09.6190863Z * [new branch] gh/ezyang/3203/orig -> origin/gh/ezyang/3203/orig 2025-12-04T11:11:09.6190932Z * [new branch] gh/ezyang/3204/base -> origin/gh/ezyang/3204/base 2025-12-04T11:11:09.6191001Z * [new branch] gh/ezyang/3204/head -> origin/gh/ezyang/3204/head 2025-12-04T11:11:09.6191071Z * [new branch] gh/ezyang/3204/orig -> origin/gh/ezyang/3204/orig 2025-12-04T11:11:09.6191139Z * [new branch] gh/ezyang/3205/base -> origin/gh/ezyang/3205/base 2025-12-04T11:11:09.6191206Z * [new branch] gh/ezyang/3205/head -> origin/gh/ezyang/3205/head 2025-12-04T11:11:09.6191278Z * [new branch] gh/ezyang/3205/orig -> origin/gh/ezyang/3205/orig 2025-12-04T11:11:09.6191348Z * [new branch] gh/ezyang/3206/base -> origin/gh/ezyang/3206/base 2025-12-04T11:11:09.6191417Z * [new branch] gh/ezyang/3206/head -> origin/gh/ezyang/3206/head 2025-12-04T11:11:09.6191488Z * [new branch] gh/ezyang/3206/orig -> origin/gh/ezyang/3206/orig 2025-12-04T11:11:09.6191556Z * [new branch] gh/ezyang/3207/base -> origin/gh/ezyang/3207/base 2025-12-04T11:11:09.6191626Z * [new branch] gh/ezyang/3207/head -> origin/gh/ezyang/3207/head 2025-12-04T11:11:09.6191698Z * [new branch] gh/ezyang/3207/orig -> origin/gh/ezyang/3207/orig 2025-12-04T11:11:09.6191766Z * [new branch] gh/ezyang/3208/base -> origin/gh/ezyang/3208/base 2025-12-04T11:11:09.6191835Z * [new branch] gh/ezyang/3208/head -> origin/gh/ezyang/3208/head 2025-12-04T11:11:09.6191908Z * [new branch] gh/ezyang/3208/orig -> origin/gh/ezyang/3208/orig 2025-12-04T11:11:09.6191976Z * [new branch] gh/ezyang/3209/base -> origin/gh/ezyang/3209/base 2025-12-04T11:11:09.6192049Z * [new branch] gh/ezyang/3209/head -> origin/gh/ezyang/3209/head 2025-12-04T11:11:09.6192116Z * [new branch] gh/ezyang/3209/orig -> origin/gh/ezyang/3209/orig 2025-12-04T11:11:09.6192187Z * [new branch] gh/fadara01/3/base -> origin/gh/fadara01/3/base 2025-12-04T11:11:09.6192259Z * [new branch] gh/fadara01/3/head -> origin/gh/fadara01/3/head 2025-12-04T11:11:09.6192329Z * [new branch] gh/fadara01/3/orig -> origin/gh/fadara01/3/orig 2025-12-04T11:11:09.6192397Z * [new branch] gh/fadara01/5/base -> origin/gh/fadara01/5/base 2025-12-04T11:11:09.6192470Z * [new branch] gh/fadara01/5/head -> origin/gh/fadara01/5/head 2025-12-04T11:11:09.6192539Z * [new branch] gh/fadara01/5/orig -> origin/gh/fadara01/5/orig 2025-12-04T11:11:09.6192607Z * [new branch] gh/fadara01/6/base -> origin/gh/fadara01/6/base 2025-12-04T11:11:09.6192679Z * [new branch] gh/fadara01/6/head -> origin/gh/fadara01/6/head 2025-12-04T11:11:09.6192747Z * [new branch] gh/fadara01/6/orig -> origin/gh/fadara01/6/orig 2025-12-04T11:11:09.6192815Z * [new branch] gh/fadara01/7/base -> origin/gh/fadara01/7/base 2025-12-04T11:11:09.6192886Z * [new branch] gh/fadara01/7/head -> origin/gh/fadara01/7/head 2025-12-04T11:11:09.6192953Z * [new branch] gh/fadara01/7/orig -> origin/gh/fadara01/7/orig 2025-12-04T11:11:09.6193021Z * [new branch] gh/fadara01/8/base -> origin/gh/fadara01/8/base 2025-12-04T11:11:09.6193091Z * [new branch] gh/fadara01/8/head -> origin/gh/fadara01/8/head 2025-12-04T11:11:09.6193181Z * [new branch] gh/fadara01/8/orig -> origin/gh/fadara01/8/orig 2025-12-04T11:11:09.6193250Z * [new branch] gh/fadara01/9/base -> origin/gh/fadara01/9/base 2025-12-04T11:11:09.6193342Z * [new branch] gh/fadara01/9/head -> origin/gh/fadara01/9/head 2025-12-04T11:11:09.6193412Z * [new branch] 
gh/fadara01/9/orig -> origin/gh/fadara01/9/orig 2025-12-04T11:11:09.6193481Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base 2025-12-04T11:11:09.6193551Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head 2025-12-04T11:11:09.6193618Z * [new branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig 2025-12-04T11:11:09.6193686Z * [new branch] gh/fduwjj/211/base -> origin/gh/fduwjj/211/base 2025-12-04T11:11:09.6193756Z * [new branch] gh/fduwjj/211/head -> origin/gh/fduwjj/211/head 2025-12-04T11:11:09.6193825Z * [new branch] gh/fduwjj/211/orig -> origin/gh/fduwjj/211/orig 2025-12-04T11:11:09.6193894Z * [new branch] gh/fduwjj/212/base -> origin/gh/fduwjj/212/base 2025-12-04T11:11:09.6193962Z * [new branch] gh/fduwjj/212/head -> origin/gh/fduwjj/212/head 2025-12-04T11:11:09.6194031Z * [new branch] gh/fduwjj/212/orig -> origin/gh/fduwjj/212/orig 2025-12-04T11:11:09.6194104Z * [new branch] gh/fduwjj/213/base -> origin/gh/fduwjj/213/base 2025-12-04T11:11:09.6194171Z * [new branch] gh/fduwjj/213/head -> origin/gh/fduwjj/213/head 2025-12-04T11:11:09.6194240Z * [new branch] gh/fduwjj/213/orig -> origin/gh/fduwjj/213/orig 2025-12-04T11:11:09.6194309Z * [new branch] gh/fduwjj/226/base -> origin/gh/fduwjj/226/base 2025-12-04T11:11:09.6194377Z * [new branch] gh/fduwjj/226/head -> origin/gh/fduwjj/226/head 2025-12-04T11:11:09.6194446Z * [new branch] gh/fduwjj/226/orig -> origin/gh/fduwjj/226/orig 2025-12-04T11:11:09.6194519Z * [new branch] gh/fduwjj/229/base -> origin/gh/fduwjj/229/base 2025-12-04T11:11:09.6194587Z * [new branch] gh/fduwjj/229/head -> origin/gh/fduwjj/229/head 2025-12-04T11:11:09.6194656Z * [new branch] gh/fduwjj/229/orig -> origin/gh/fduwjj/229/orig 2025-12-04T11:11:09.6194727Z * [new branch] gh/fduwjj/233/base -> origin/gh/fduwjj/233/base 2025-12-04T11:11:09.6194795Z * [new branch] gh/fduwjj/233/head -> origin/gh/fduwjj/233/head 2025-12-04T11:11:09.6194862Z * [new branch] gh/fduwjj/233/orig -> origin/gh/fduwjj/233/orig 2025-12-04T11:11:09.6194935Z * [new branch] gh/fduwjj/234/base -> origin/gh/fduwjj/234/base 2025-12-04T11:11:09.6195005Z * [new branch] gh/fduwjj/234/head -> origin/gh/fduwjj/234/head 2025-12-04T11:11:09.6195075Z * [new branch] gh/fduwjj/234/orig -> origin/gh/fduwjj/234/orig 2025-12-04T11:11:09.6195148Z * [new branch] gh/fduwjj/235/base -> origin/gh/fduwjj/235/base 2025-12-04T11:11:09.6195218Z * [new branch] gh/fduwjj/235/head -> origin/gh/fduwjj/235/head 2025-12-04T11:11:09.6195289Z * [new branch] gh/fduwjj/235/orig -> origin/gh/fduwjj/235/orig 2025-12-04T11:11:09.6195363Z * [new branch] gh/fduwjj/236/base -> origin/gh/fduwjj/236/base 2025-12-04T11:11:09.6195432Z * [new branch] gh/fduwjj/236/head -> origin/gh/fduwjj/236/head 2025-12-04T11:11:09.6195507Z * [new branch] gh/fduwjj/236/orig -> origin/gh/fduwjj/236/orig 2025-12-04T11:11:09.6195575Z * [new branch] gh/fduwjj/237/base -> origin/gh/fduwjj/237/base 2025-12-04T11:11:09.6195644Z * [new branch] gh/fduwjj/237/head -> origin/gh/fduwjj/237/head 2025-12-04T11:11:09.6195735Z * [new branch] gh/fduwjj/237/orig -> origin/gh/fduwjj/237/orig 2025-12-04T11:11:09.6195806Z * [new branch] gh/fduwjj/238/base -> origin/gh/fduwjj/238/base 2025-12-04T11:11:09.6195874Z * [new branch] gh/fduwjj/238/head -> origin/gh/fduwjj/238/head 2025-12-04T11:11:09.6195970Z * [new branch] gh/fduwjj/238/orig -> origin/gh/fduwjj/238/orig 2025-12-04T11:11:09.6196040Z * [new branch] gh/fduwjj/239/base -> origin/gh/fduwjj/239/base 2025-12-04T11:11:09.6196108Z * [new branch] gh/fduwjj/239/head -> origin/gh/fduwjj/239/head 
2025-12-04T11:11:09.6196181Z * [new branch] gh/fduwjj/239/orig -> origin/gh/fduwjj/239/orig 2025-12-04T11:11:09.6196254Z * [new branch] gh/fegin/332/base -> origin/gh/fegin/332/base 2025-12-04T11:11:09.6196323Z * [new branch] gh/fegin/332/head -> origin/gh/fegin/332/head 2025-12-04T11:11:09.6196398Z * [new branch] gh/fegin/332/orig -> origin/gh/fegin/332/orig 2025-12-04T11:11:09.6196467Z * [new branch] gh/fegin/333/base -> origin/gh/fegin/333/base 2025-12-04T11:11:09.6196533Z * [new branch] gh/fegin/333/head -> origin/gh/fegin/333/head 2025-12-04T11:11:09.6196603Z * [new branch] gh/fegin/333/orig -> origin/gh/fegin/333/orig 2025-12-04T11:11:09.6196670Z * [new branch] gh/fegin/334/base -> origin/gh/fegin/334/base 2025-12-04T11:11:09.6196738Z * [new branch] gh/fegin/334/head -> origin/gh/fegin/334/head 2025-12-04T11:11:09.6196807Z * [new branch] gh/fegin/334/orig -> origin/gh/fegin/334/orig 2025-12-04T11:11:09.6196876Z * [new branch] gh/fegin/335/base -> origin/gh/fegin/335/base 2025-12-04T11:11:09.6196943Z * [new branch] gh/fegin/335/head -> origin/gh/fegin/335/head 2025-12-04T11:11:09.6197016Z * [new branch] gh/fegin/335/orig -> origin/gh/fegin/335/orig 2025-12-04T11:11:09.6197086Z * [new branch] gh/fffrog/160/base -> origin/gh/fffrog/160/base 2025-12-04T11:11:09.6197160Z * [new branch] gh/fffrog/160/head -> origin/gh/fffrog/160/head 2025-12-04T11:11:09.6197232Z * [new branch] gh/fffrog/177/base -> origin/gh/fffrog/177/base 2025-12-04T11:11:09.6197301Z * [new branch] gh/fffrog/177/head -> origin/gh/fffrog/177/head 2025-12-04T11:11:09.6197373Z * [new branch] gh/fffrog/177/orig -> origin/gh/fffrog/177/orig 2025-12-04T11:11:09.6197442Z * [new branch] gh/fffrog/178/base -> origin/gh/fffrog/178/base 2025-12-04T11:11:09.6197511Z * [new branch] gh/fffrog/178/head -> origin/gh/fffrog/178/head 2025-12-04T11:11:09.6197580Z * [new branch] gh/fffrog/178/orig -> origin/gh/fffrog/178/orig 2025-12-04T11:11:09.6197649Z * [new branch] gh/fffrog/181/base -> origin/gh/fffrog/181/base 2025-12-04T11:11:09.6197718Z * [new branch] gh/fffrog/181/head -> origin/gh/fffrog/181/head 2025-12-04T11:11:09.6197788Z * [new branch] gh/fffrog/181/orig -> origin/gh/fffrog/181/orig 2025-12-04T11:11:09.6197859Z * [new branch] gh/fffrog/183/base -> origin/gh/fffrog/183/base 2025-12-04T11:11:09.6197926Z * [new branch] gh/fffrog/183/head -> origin/gh/fffrog/183/head 2025-12-04T11:11:09.6197999Z * [new branch] gh/fffrog/183/orig -> origin/gh/fffrog/183/orig 2025-12-04T11:11:09.6198068Z * [new branch] gh/fxdawnn/10/base -> origin/gh/fxdawnn/10/base 2025-12-04T11:11:09.6198137Z * [new branch] gh/fxdawnn/10/head -> origin/gh/fxdawnn/10/head 2025-12-04T11:11:09.6198244Z * [new branch] gh/fxdawnn/10/orig -> origin/gh/fxdawnn/10/orig 2025-12-04T11:11:09.6198313Z * [new branch] gh/fxdawnn/11/base -> origin/gh/fxdawnn/11/base 2025-12-04T11:11:09.6198419Z * [new branch] gh/fxdawnn/11/head -> origin/gh/fxdawnn/11/head 2025-12-04T11:11:09.6198493Z * [new branch] gh/fxdawnn/11/orig -> origin/gh/fxdawnn/11/orig 2025-12-04T11:11:09.6198817Z * [new branch] gh/fxdawnn/12/base -> origin/gh/fxdawnn/12/base 2025-12-04T11:11:09.6198888Z * [new branch] gh/fxdawnn/12/head -> origin/gh/fxdawnn/12/head 2025-12-04T11:11:09.6198956Z * [new branch] gh/fxdawnn/12/orig -> origin/gh/fxdawnn/12/orig 2025-12-04T11:11:09.6199025Z * [new branch] gh/fxdawnn/13/base -> origin/gh/fxdawnn/13/base 2025-12-04T11:11:09.6199097Z * [new branch] gh/fxdawnn/13/head -> origin/gh/fxdawnn/13/head 2025-12-04T11:11:09.6199166Z * [new branch] gh/fxdawnn/13/orig -> 
origin/gh/fxdawnn/13/orig 2025-12-04T11:11:09.6199233Z * [new branch] gh/fxdawnn/14/base -> origin/gh/fxdawnn/14/base 2025-12-04T11:11:09.6199310Z * [new branch] gh/fxdawnn/14/head -> origin/gh/fxdawnn/14/head 2025-12-04T11:11:09.6199378Z * [new branch] gh/fxdawnn/14/orig -> origin/gh/fxdawnn/14/orig 2025-12-04T11:11:09.6199448Z * [new branch] gh/fxdawnn/15/base -> origin/gh/fxdawnn/15/base 2025-12-04T11:11:09.6199518Z * [new branch] gh/fxdawnn/15/head -> origin/gh/fxdawnn/15/head 2025-12-04T11:11:09.6199585Z * [new branch] gh/fxdawnn/15/orig -> origin/gh/fxdawnn/15/orig 2025-12-04T11:11:09.6199655Z * [new branch] gh/fxdawnn/6/base -> origin/gh/fxdawnn/6/base 2025-12-04T11:11:09.6199726Z * [new branch] gh/fxdawnn/6/head -> origin/gh/fxdawnn/6/head 2025-12-04T11:11:09.6199793Z * [new branch] gh/fxdawnn/6/orig -> origin/gh/fxdawnn/6/orig 2025-12-04T11:11:09.6199861Z * [new branch] gh/fxdawnn/7/base -> origin/gh/fxdawnn/7/base 2025-12-04T11:11:09.6199937Z * [new branch] gh/fxdawnn/7/head -> origin/gh/fxdawnn/7/head 2025-12-04T11:11:09.6200005Z * [new branch] gh/fxdawnn/7/orig -> origin/gh/fxdawnn/7/orig 2025-12-04T11:11:09.6200074Z * [new branch] gh/fxdawnn/9/base -> origin/gh/fxdawnn/9/base 2025-12-04T11:11:09.6200143Z * [new branch] gh/fxdawnn/9/head -> origin/gh/fxdawnn/9/head 2025-12-04T11:11:09.6200210Z * [new branch] gh/fxdawnn/9/orig -> origin/gh/fxdawnn/9/orig 2025-12-04T11:11:09.6200278Z * [new branch] gh/galv/1/base -> origin/gh/galv/1/base 2025-12-04T11:11:09.6200345Z * [new branch] gh/galv/1/head -> origin/gh/galv/1/head 2025-12-04T11:11:09.6200411Z * [new branch] gh/galv/1/orig -> origin/gh/galv/1/orig 2025-12-04T11:11:09.6200476Z * [new branch] gh/galv/2/base -> origin/gh/galv/2/base 2025-12-04T11:11:09.6200547Z * [new branch] gh/galv/2/head -> origin/gh/galv/2/head 2025-12-04T11:11:09.6200611Z * [new branch] gh/galv/2/orig -> origin/gh/galv/2/orig 2025-12-04T11:11:09.6200678Z * [new branch] gh/galv/3/base -> origin/gh/galv/3/base 2025-12-04T11:11:09.6200744Z * [new branch] gh/galv/3/head -> origin/gh/galv/3/head 2025-12-04T11:11:09.6200809Z * [new branch] gh/galv/3/orig -> origin/gh/galv/3/orig 2025-12-04T11:11:09.6200892Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-12-04T11:11:09.6200969Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-12-04T11:11:09.6201043Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-12-04T11:11:09.6201118Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base 2025-12-04T11:11:09.6201209Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-12-04T11:11:09.6201281Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-12-04T11:11:09.6201354Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base 2025-12-04T11:11:09.6201453Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 2025-12-04T11:11:09.6201524Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-12-04T11:11:09.6201598Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-12-04T11:11:09.6201668Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-12-04T11:11:09.6201739Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-12-04T11:11:09.6201813Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-12-04T11:11:09.6201885Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-12-04T11:11:09.6201955Z * [new branch] gh/guangyey/170/orig 
-> origin/gh/guangyey/170/orig 2025-12-04T11:11:09.6202031Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-12-04T11:11:09.6202103Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-12-04T11:11:09.6202175Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-12-04T11:11:09.6202250Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-12-04T11:11:09.6202322Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-12-04T11:11:09.6202397Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-12-04T11:11:09.6202468Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-12-04T11:11:09.6202539Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-12-04T11:11:09.6202611Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-12-04T11:11:09.6202683Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-12-04T11:11:09.6202753Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-12-04T11:11:09.6202826Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-12-04T11:11:09.6202897Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-12-04T11:11:09.6202967Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-12-04T11:11:09.6203039Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-12-04T11:11:09.6203110Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base 2025-12-04T11:11:09.6203184Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head 2025-12-04T11:11:09.6203257Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig 2025-12-04T11:11:09.6203329Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base 2025-12-04T11:11:09.6203399Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head 2025-12-04T11:11:09.6203470Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig 2025-12-04T11:11:09.6203539Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base 2025-12-04T11:11:09.6203609Z * [new branch] gh/guangyey/188/head -> origin/gh/guangyey/188/head 2025-12-04T11:11:09.6203680Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig 2025-12-04T11:11:09.6203750Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base 2025-12-04T11:11:09.6203842Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head 2025-12-04T11:11:09.6203912Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig 2025-12-04T11:11:09.6204036Z * [new branch] gh/guangyey/208/base -> origin/gh/guangyey/208/base 2025-12-04T11:11:09.6204110Z * [new branch] gh/guangyey/208/head -> origin/gh/guangyey/208/head 2025-12-04T11:11:09.6204181Z * [new branch] gh/guangyey/208/orig -> origin/gh/guangyey/208/orig 2025-12-04T11:11:09.6204251Z * [new branch] gh/guangyey/228/base -> origin/gh/guangyey/228/base 2025-12-04T11:11:09.6204324Z * [new branch] gh/guangyey/228/head -> origin/gh/guangyey/228/head 2025-12-04T11:11:09.6204395Z * [new branch] gh/guangyey/228/orig -> origin/gh/guangyey/228/orig 2025-12-04T11:11:09.6204466Z * [new branch] gh/guangyey/230/base -> origin/gh/guangyey/230/base 2025-12-04T11:11:09.6204541Z * [new branch] gh/guangyey/230/head -> origin/gh/guangyey/230/head 2025-12-04T11:11:09.6204613Z * [new branch] gh/guangyey/230/orig -> origin/gh/guangyey/230/orig 2025-12-04T11:11:09.6204688Z * [new branch] gh/guangyey/231/base -> 
origin/gh/guangyey/231/base 2025-12-04T11:11:09.6204763Z * [new branch] gh/guangyey/231/head -> origin/gh/guangyey/231/head 2025-12-04T11:11:09.6204836Z * [new branch] gh/guangyey/231/orig -> origin/gh/guangyey/231/orig 2025-12-04T11:11:09.6204908Z * [new branch] gh/guangyey/232/base -> origin/gh/guangyey/232/base 2025-12-04T11:11:09.6204985Z * [new branch] gh/guangyey/232/head -> origin/gh/guangyey/232/head 2025-12-04T11:11:09.6205057Z * [new branch] gh/guangyey/232/orig -> origin/gh/guangyey/232/orig 2025-12-04T11:11:09.6205128Z * [new branch] gh/guangyey/233/base -> origin/gh/guangyey/233/base 2025-12-04T11:11:09.6205206Z * [new branch] gh/guangyey/233/head -> origin/gh/guangyey/233/head 2025-12-04T11:11:09.6205279Z * [new branch] gh/guangyey/233/orig -> origin/gh/guangyey/233/orig 2025-12-04T11:11:09.6205357Z * [new branch] gh/guangyey/234/base -> origin/gh/guangyey/234/base 2025-12-04T11:11:09.6205429Z * [new branch] gh/guangyey/234/head -> origin/gh/guangyey/234/head 2025-12-04T11:11:09.6205503Z * [new branch] gh/guangyey/234/orig -> origin/gh/guangyey/234/orig 2025-12-04T11:11:09.6205579Z * [new branch] gh/guangyey/235/base -> origin/gh/guangyey/235/base 2025-12-04T11:11:09.6205652Z * [new branch] gh/guangyey/235/head -> origin/gh/guangyey/235/head 2025-12-04T11:11:09.6205723Z * [new branch] gh/guangyey/235/orig -> origin/gh/guangyey/235/orig 2025-12-04T11:11:09.6205798Z * [new branch] gh/guangyey/236/base -> origin/gh/guangyey/236/base 2025-12-04T11:11:09.6205868Z * [new branch] gh/guangyey/236/head -> origin/gh/guangyey/236/head 2025-12-04T11:11:09.6205939Z * [new branch] gh/guangyey/236/orig -> origin/gh/guangyey/236/orig 2025-12-04T11:11:09.6206015Z * [new branch] gh/guangyey/237/base -> origin/gh/guangyey/237/base 2025-12-04T11:11:09.6206086Z * [new branch] gh/guangyey/237/head -> origin/gh/guangyey/237/head 2025-12-04T11:11:09.6206159Z * [new branch] gh/guangyey/237/orig -> origin/gh/guangyey/237/orig 2025-12-04T11:11:09.6206232Z * [new branch] gh/guangyey/238/base -> origin/gh/guangyey/238/base 2025-12-04T11:11:09.6206304Z * [new branch] gh/guangyey/238/head -> origin/gh/guangyey/238/head 2025-12-04T11:11:09.6206375Z * [new branch] gh/guangyey/239/base -> origin/gh/guangyey/239/base 2025-12-04T11:11:09.6206485Z * [new branch] gh/guangyey/239/head -> origin/gh/guangyey/239/head 2025-12-04T11:11:09.6206557Z * [new branch] gh/guangyey/239/orig -> origin/gh/guangyey/239/orig 2025-12-04T11:11:09.6206629Z * [new branch] gh/guangyey/240/base -> origin/gh/guangyey/240/base 2025-12-04T11:11:09.6206728Z * [new branch] gh/guangyey/240/head -> origin/gh/guangyey/240/head 2025-12-04T11:11:09.6206800Z * [new branch] gh/guangyey/240/orig -> origin/gh/guangyey/240/orig 2025-12-04T11:11:09.6206872Z * [new branch] gh/guangyey/241/base -> origin/gh/guangyey/241/base 2025-12-04T11:11:09.6206946Z * [new branch] gh/guangyey/241/head -> origin/gh/guangyey/241/head 2025-12-04T11:11:09.6207017Z * [new branch] gh/guangyey/241/orig -> origin/gh/guangyey/241/orig 2025-12-04T11:11:09.6207091Z * [new branch] gh/guangyey/242/base -> origin/gh/guangyey/242/base 2025-12-04T11:11:09.6207163Z * [new branch] gh/guangyey/242/head -> origin/gh/guangyey/242/head 2025-12-04T11:11:09.6207235Z * [new branch] gh/guangyey/242/orig -> origin/gh/guangyey/242/orig 2025-12-04T11:11:09.6207307Z * [new branch] gh/guangyey/243/base -> origin/gh/guangyey/243/base 2025-12-04T11:11:09.6207379Z * [new branch] gh/guangyey/243/head -> origin/gh/guangyey/243/head 2025-12-04T11:11:09.6207451Z * [new branch] gh/guangyey/243/orig -> 
origin/gh/guangyey/243/orig 2025-12-04T11:11:09.6207526Z * [new branch] gh/guangyey/244/base -> origin/gh/guangyey/244/base 2025-12-04T11:11:09.6207598Z * [new branch] gh/guangyey/244/head -> origin/gh/guangyey/244/head 2025-12-04T11:11:09.6207668Z * [new branch] gh/guangyey/244/orig -> origin/gh/guangyey/244/orig 2025-12-04T11:11:09.6207741Z * [new branch] gh/guangyey/245/base -> origin/gh/guangyey/245/base 2025-12-04T11:11:09.6207813Z * [new branch] gh/guangyey/245/head -> origin/gh/guangyey/245/head 2025-12-04T11:11:09.6207884Z * [new branch] gh/guangyey/245/orig -> origin/gh/guangyey/245/orig 2025-12-04T11:11:09.6207959Z * [new branch] gh/guangyey/246/base -> origin/gh/guangyey/246/base 2025-12-04T11:11:09.6208031Z * [new branch] gh/guangyey/246/head -> origin/gh/guangyey/246/head 2025-12-04T11:11:09.6208103Z * [new branch] gh/guangyey/246/orig -> origin/gh/guangyey/246/orig 2025-12-04T11:11:09.6208227Z * [new branch] gh/guangyey/247/base -> origin/gh/guangyey/247/base 2025-12-04T11:11:09.6208301Z * [new branch] gh/guangyey/247/head -> origin/gh/guangyey/247/head 2025-12-04T11:11:09.6208375Z * [new branch] gh/guangyey/247/orig -> origin/gh/guangyey/247/orig 2025-12-04T11:11:09.6208452Z * [new branch] gh/guangyey/248/base -> origin/gh/guangyey/248/base 2025-12-04T11:11:09.6208526Z * [new branch] gh/guangyey/248/head -> origin/gh/guangyey/248/head 2025-12-04T11:11:09.6208603Z * [new branch] gh/guangyey/248/orig -> origin/gh/guangyey/248/orig 2025-12-04T11:11:09.6208676Z * [new branch] gh/guangyey/249/base -> origin/gh/guangyey/249/base 2025-12-04T11:11:09.6208750Z * [new branch] gh/guangyey/249/head -> origin/gh/guangyey/249/head 2025-12-04T11:11:09.6208827Z * [new branch] gh/guangyey/249/orig -> origin/gh/guangyey/249/orig 2025-12-04T11:11:09.6208900Z * [new branch] gh/guangyey/250/base -> origin/gh/guangyey/250/base 2025-12-04T11:11:09.6208975Z * [new branch] gh/guangyey/250/head -> origin/gh/guangyey/250/head 2025-12-04T11:11:09.6209051Z * [new branch] gh/guangyey/250/orig -> origin/gh/guangyey/250/orig 2025-12-04T11:11:09.6209124Z * [new branch] gh/guangyey/251/base -> origin/gh/guangyey/251/base 2025-12-04T11:11:09.6209221Z * [new branch] gh/guangyey/251/head -> origin/gh/guangyey/251/head 2025-12-04T11:11:09.6209299Z * [new branch] gh/guangyey/251/orig -> origin/gh/guangyey/251/orig 2025-12-04T11:11:09.6209399Z * [new branch] gh/guangyey/252/base -> origin/gh/guangyey/252/base 2025-12-04T11:11:09.6209472Z * [new branch] gh/guangyey/252/head -> origin/gh/guangyey/252/head 2025-12-04T11:11:09.6209551Z * [new branch] gh/guangyey/252/orig -> origin/gh/guangyey/252/orig 2025-12-04T11:11:09.6209625Z * [new branch] gh/guangyey/253/base -> origin/gh/guangyey/253/base 2025-12-04T11:11:09.6209698Z * [new branch] gh/guangyey/253/head -> origin/gh/guangyey/253/head 2025-12-04T11:11:09.6209775Z * [new branch] gh/guangyey/253/orig -> origin/gh/guangyey/253/orig 2025-12-04T11:11:09.6209848Z * [new branch] gh/guangyey/254/base -> origin/gh/guangyey/254/base 2025-12-04T11:11:09.6209921Z * [new branch] gh/guangyey/254/head -> origin/gh/guangyey/254/head 2025-12-04T11:11:09.6210001Z * [new branch] gh/guangyey/254/orig -> origin/gh/guangyey/254/orig 2025-12-04T11:11:09.6210076Z * [new branch] gh/guangyey/255/base -> origin/gh/guangyey/255/base 2025-12-04T11:11:09.6210153Z * [new branch] gh/guangyey/255/head -> origin/gh/guangyey/255/head 2025-12-04T11:11:09.6210226Z * [new branch] gh/guangyey/255/orig -> origin/gh/guangyey/255/orig 2025-12-04T11:11:09.6210299Z * [new branch] gh/guangyey/256/base -> 
origin/gh/guangyey/256/base 2025-12-04T11:11:09.6210378Z * [new branch] gh/guangyey/256/head -> origin/gh/guangyey/256/head 2025-12-04T11:11:09.6210448Z * [new branch] gh/guangyey/256/orig -> origin/gh/guangyey/256/orig 2025-12-04T11:11:09.6210550Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-12-04T11:11:09.6210653Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-12-04T11:11:09.6210747Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-12-04T11:11:09.6210840Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-12-04T11:11:09.6210935Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-12-04T11:11:09.6211025Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-12-04T11:11:09.6211116Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-12-04T11:11:09.6211210Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-12-04T11:11:09.6211301Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-12-04T11:11:09.6211392Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-12-04T11:11:09.6211487Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-12-04T11:11:09.6211578Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-12-04T11:11:09.6211673Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-12-04T11:11:09.6211763Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-12-04T11:11:09.6211853Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-12-04T11:11:09.6211945Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-12-04T11:11:09.6212035Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-12-04T11:11:09.6212151Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-12-04T11:11:09.6212246Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-12-04T11:11:09.6212359Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-12-04T11:11:09.6212449Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-12-04T11:11:09.6212544Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-12-04T11:11:09.6212634Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-12-04T11:11:09.6212723Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-12-04T11:11:09.6212817Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-12-04T11:11:09.6212909Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-12-04T11:11:09.6212998Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-12-04T11:11:09.6213094Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-12-04T11:11:09.6213184Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-12-04T11:11:09.6213279Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-12-04T11:11:09.6213371Z * 
[new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-12-04T11:11:09.6213458Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-12-04T11:11:09.6213552Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-12-04T11:11:09.6213644Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-12-04T11:11:09.6213735Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-12-04T11:11:09.6213830Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-12-04T11:11:09.6213921Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-12-04T11:11:09.6214011Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-12-04T11:11:09.6214103Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-12-04T11:11:09.6214193Z * [new branch] gh/guilhermeleobas/247/base -> origin/gh/guilhermeleobas/247/base 2025-12-04T11:11:09.6214282Z * [new branch] gh/guilhermeleobas/247/head -> origin/gh/guilhermeleobas/247/head 2025-12-04T11:11:09.6214378Z * [new branch] gh/guilhermeleobas/247/orig -> origin/gh/guilhermeleobas/247/orig 2025-12-04T11:11:09.6214468Z * [new branch] gh/guilhermeleobas/248/base -> origin/gh/guilhermeleobas/248/base 2025-12-04T11:11:09.6214558Z * [new branch] gh/guilhermeleobas/248/head -> origin/gh/guilhermeleobas/248/head 2025-12-04T11:11:09.6214652Z * [new branch] gh/guilhermeleobas/248/orig -> origin/gh/guilhermeleobas/248/orig 2025-12-04T11:11:09.6214740Z * [new branch] gh/guilhermeleobas/250/base -> origin/gh/guilhermeleobas/250/base 2025-12-04T11:11:09.6214833Z * [new branch] gh/guilhermeleobas/250/head -> origin/gh/guilhermeleobas/250/head 2025-12-04T11:11:09.6214922Z * [new branch] gh/guilhermeleobas/250/orig -> origin/gh/guilhermeleobas/250/orig 2025-12-04T11:11:09.6215013Z * [new branch] gh/guilhermeleobas/253/base -> origin/gh/guilhermeleobas/253/base 2025-12-04T11:11:09.6215126Z * [new branch] gh/guilhermeleobas/253/head -> origin/gh/guilhermeleobas/253/head 2025-12-04T11:11:09.6215217Z * [new branch] gh/guilhermeleobas/253/orig -> origin/gh/guilhermeleobas/253/orig 2025-12-04T11:11:09.6215307Z * [new branch] gh/guilhermeleobas/254/base -> origin/gh/guilhermeleobas/254/base 2025-12-04T11:11:09.6215420Z * [new branch] gh/guilhermeleobas/254/head -> origin/gh/guilhermeleobas/254/head 2025-12-04T11:11:09.6215510Z * [new branch] gh/guilhermeleobas/254/orig -> origin/gh/guilhermeleobas/254/orig 2025-12-04T11:11:09.6215599Z * [new branch] gh/guilhermeleobas/255/base -> origin/gh/guilhermeleobas/255/base 2025-12-04T11:11:09.6215693Z * [new branch] gh/guilhermeleobas/255/head -> origin/gh/guilhermeleobas/255/head 2025-12-04T11:11:09.6215783Z * [new branch] gh/guilhermeleobas/255/orig -> origin/gh/guilhermeleobas/255/orig 2025-12-04T11:11:09.6215874Z * [new branch] gh/guilhermeleobas/256/base -> origin/gh/guilhermeleobas/256/base 2025-12-04T11:11:09.6215968Z * [new branch] gh/guilhermeleobas/256/head -> origin/gh/guilhermeleobas/256/head 2025-12-04T11:11:09.6216058Z * [new branch] gh/guilhermeleobas/256/orig -> origin/gh/guilhermeleobas/256/orig 2025-12-04T11:11:09.6216150Z * [new branch] gh/guilhermeleobas/257/base -> origin/gh/guilhermeleobas/257/base 2025-12-04T11:11:09.6216241Z * [new branch] gh/guilhermeleobas/257/head -> origin/gh/guilhermeleobas/257/head 2025-12-04T11:11:09.6216330Z * [new branch] 
gh/guilhermeleobas/257/orig -> origin/gh/guilhermeleobas/257/orig 2025-12-04T11:11:09.6216425Z * [new branch] gh/guilhermeleobas/258/base -> origin/gh/guilhermeleobas/258/base 2025-12-04T11:11:09.6216516Z * [new branch] gh/guilhermeleobas/258/head -> origin/gh/guilhermeleobas/258/head 2025-12-04T11:11:09.6216606Z * [new branch] gh/guilhermeleobas/258/orig -> origin/gh/guilhermeleobas/258/orig 2025-12-04T11:11:09.6216701Z * [new branch] gh/guilhermeleobas/259/base -> origin/gh/guilhermeleobas/259/base 2025-12-04T11:11:09.6216792Z * [new branch] gh/guilhermeleobas/259/head -> origin/gh/guilhermeleobas/259/head 2025-12-04T11:11:09.6216884Z * [new branch] gh/guilhermeleobas/259/orig -> origin/gh/guilhermeleobas/259/orig 2025-12-04T11:11:09.6216981Z * [new branch] gh/guilhermeleobas/260/base -> origin/gh/guilhermeleobas/260/base 2025-12-04T11:11:09.6217071Z * [new branch] gh/guilhermeleobas/260/head -> origin/gh/guilhermeleobas/260/head 2025-12-04T11:11:09.6217162Z * [new branch] gh/guilhermeleobas/260/orig -> origin/gh/guilhermeleobas/260/orig 2025-12-04T11:11:09.6217255Z * [new branch] gh/guilhermeleobas/261/base -> origin/gh/guilhermeleobas/261/base 2025-12-04T11:11:09.6217345Z * [new branch] gh/guilhermeleobas/261/head -> origin/gh/guilhermeleobas/261/head 2025-12-04T11:11:09.6217437Z * [new branch] gh/guilhermeleobas/261/orig -> origin/gh/guilhermeleobas/261/orig 2025-12-04T11:11:09.6217532Z * [new branch] gh/guilhermeleobas/262/base -> origin/gh/guilhermeleobas/262/base 2025-12-04T11:11:09.6217624Z * [new branch] gh/guilhermeleobas/262/head -> origin/gh/guilhermeleobas/262/head 2025-12-04T11:11:09.6217711Z * [new branch] gh/guilhermeleobas/262/orig -> origin/gh/guilhermeleobas/262/orig 2025-12-04T11:11:09.6217805Z * [new branch] gh/guilhermeleobas/263/base -> origin/gh/guilhermeleobas/263/base 2025-12-04T11:11:09.6217896Z * [new branch] gh/guilhermeleobas/263/head -> origin/gh/guilhermeleobas/263/head 2025-12-04T11:11:09.6217990Z * [new branch] gh/guilhermeleobas/263/orig -> origin/gh/guilhermeleobas/263/orig 2025-12-04T11:11:09.6218081Z * [new branch] gh/guilhermeleobas/264/base -> origin/gh/guilhermeleobas/264/base 2025-12-04T11:11:09.6218253Z * [new branch] gh/guilhermeleobas/264/head -> origin/gh/guilhermeleobas/264/head 2025-12-04T11:11:09.6218349Z * [new branch] gh/guilhermeleobas/264/orig -> origin/gh/guilhermeleobas/264/orig 2025-12-04T11:11:09.6218464Z * [new branch] gh/guilhermeleobas/265/base -> origin/gh/guilhermeleobas/265/base 2025-12-04T11:11:09.6218553Z * [new branch] gh/guilhermeleobas/265/head -> origin/gh/guilhermeleobas/265/head 2025-12-04T11:11:09.6218647Z * [new branch] gh/guilhermeleobas/265/orig -> origin/gh/guilhermeleobas/265/orig 2025-12-04T11:11:09.6218738Z * [new branch] gh/guilhermeleobas/266/base -> origin/gh/guilhermeleobas/266/base 2025-12-04T11:11:09.6218828Z * [new branch] gh/guilhermeleobas/266/head -> origin/gh/guilhermeleobas/266/head 2025-12-04T11:11:09.6218922Z * [new branch] gh/guilhermeleobas/266/orig -> origin/gh/guilhermeleobas/266/orig 2025-12-04T11:11:09.6219013Z * [new branch] gh/guilhermeleobas/267/base -> origin/gh/guilhermeleobas/267/base 2025-12-04T11:11:09.6219147Z * [new branch] gh/guilhermeleobas/267/head -> origin/gh/guilhermeleobas/267/head 2025-12-04T11:11:09.6219294Z * [new branch] gh/guilhermeleobas/267/orig -> origin/gh/guilhermeleobas/267/orig 2025-12-04T11:11:09.6219382Z * [new branch] gh/hameerabbasi/1/base -> origin/gh/hameerabbasi/1/base 2025-12-04T11:11:09.6219465Z * [new branch] gh/hameerabbasi/1/head -> 
origin/gh/hameerabbasi/1/head 2025-12-04T11:11:09.6219548Z * [new branch] gh/hameerabbasi/2/base -> origin/gh/hameerabbasi/2/base 2025-12-04T11:11:09.6219626Z * [new branch] gh/hameerabbasi/2/head -> origin/gh/hameerabbasi/2/head 2025-12-04T11:11:09.6219707Z * [new branch] gh/hameerabbasi/2/orig -> origin/gh/hameerabbasi/2/orig 2025-12-04T11:11:09.6219785Z * [new branch] gh/hameerabbasi/3/base -> origin/gh/hameerabbasi/3/base 2025-12-04T11:11:09.6219860Z * [new branch] gh/hameerabbasi/3/head -> origin/gh/hameerabbasi/3/head 2025-12-04T11:11:09.6219941Z * [new branch] gh/hameerabbasi/3/orig -> origin/gh/hameerabbasi/3/orig 2025-12-04T11:11:09.6220019Z * [new branch] gh/hameerabbasi/4/base -> origin/gh/hameerabbasi/4/base 2025-12-04T11:11:09.6220097Z * [new branch] gh/hameerabbasi/4/head -> origin/gh/hameerabbasi/4/head 2025-12-04T11:11:09.6220177Z * [new branch] gh/hameerabbasi/4/orig -> origin/gh/hameerabbasi/4/orig 2025-12-04T11:11:09.6220248Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-12-04T11:11:09.6220318Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-12-04T11:11:09.6220391Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-12-04T11:11:09.6220460Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-12-04T11:11:09.6220530Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-12-04T11:11:09.6220602Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-12-04T11:11:09.6220672Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-12-04T11:11:09.6220741Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-12-04T11:11:09.6220817Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-12-04T11:11:09.6220889Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-12-04T11:11:09.6220960Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-12-04T11:11:09.6221031Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-12-04T11:11:09.6221101Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-12-04T11:11:09.6221196Z * [new branch] gh/isuruf/158/base -> origin/gh/isuruf/158/base 2025-12-04T11:11:09.6221271Z * [new branch] gh/isuruf/158/head -> origin/gh/isuruf/158/head 2025-12-04T11:11:09.6221365Z * [new branch] gh/isuruf/159/base -> origin/gh/isuruf/159/base 2025-12-04T11:11:09.6221438Z * [new branch] gh/isuruf/159/head -> origin/gh/isuruf/159/head 2025-12-04T11:11:09.6221508Z * [new branch] gh/isuruf/160/base -> origin/gh/isuruf/160/base 2025-12-04T11:11:09.6221577Z * [new branch] gh/isuruf/160/head -> origin/gh/isuruf/160/head 2025-12-04T11:11:09.6221651Z * [new branch] gh/isuruf/160/orig -> origin/gh/isuruf/160/orig 2025-12-04T11:11:09.6221723Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-12-04T11:11:09.6221794Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-12-04T11:11:09.6221869Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-12-04T11:11:09.6221946Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-12-04T11:11:09.6222025Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-12-04T11:11:09.6222105Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-12-04T11:11:09.6222176Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-12-04T11:11:09.6222250Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-12-04T11:11:09.6222326Z * [new branch] gh/jamesjwu/187/orig -> 
origin/gh/jamesjwu/187/orig 2025-12-04T11:11:09.6222399Z * [new branch] gh/jamesjwu/196/base -> origin/gh/jamesjwu/196/base 2025-12-04T11:11:09.6222470Z * [new branch] gh/jamesjwu/196/head -> origin/gh/jamesjwu/196/head 2025-12-04T11:11:09.6222548Z * [new branch] gh/jamesjwu/196/orig -> origin/gh/jamesjwu/196/orig 2025-12-04T11:11:09.6222621Z * [new branch] gh/jamesjwu/198/base -> origin/gh/jamesjwu/198/base 2025-12-04T11:11:09.6222695Z * [new branch] gh/jamesjwu/198/head -> origin/gh/jamesjwu/198/head 2025-12-04T11:11:09.6222775Z * [new branch] gh/jamesjwu/198/orig -> origin/gh/jamesjwu/198/orig 2025-12-04T11:11:09.6222847Z * [new branch] gh/jamesjwu/207/base -> origin/gh/jamesjwu/207/base 2025-12-04T11:11:09.6222919Z * [new branch] gh/jamesjwu/207/head -> origin/gh/jamesjwu/207/head 2025-12-04T11:11:09.6222996Z * [new branch] gh/jamesjwu/207/orig -> origin/gh/jamesjwu/207/orig 2025-12-04T11:11:09.6223069Z * [new branch] gh/jamesjwu/208/base -> origin/gh/jamesjwu/208/base 2025-12-04T11:11:09.6223147Z * [new branch] gh/jamesjwu/208/head -> origin/gh/jamesjwu/208/head 2025-12-04T11:11:09.6223218Z * [new branch] gh/jamesjwu/208/orig -> origin/gh/jamesjwu/208/orig 2025-12-04T11:11:09.6223291Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-12-04T11:11:09.6223371Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-12-04T11:11:09.6223444Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-12-04T11:11:09.6223515Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-12-04T11:11:09.6223590Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-12-04T11:11:09.6223660Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-12-04T11:11:09.6223729Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-12-04T11:11:09.6223800Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-12-04T11:11:09.6223901Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-12-04T11:11:09.6223973Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-12-04T11:11:09.6224068Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-12-04T11:11:09.6224139Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-12-04T11:11:09.6224210Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-12-04T11:11:09.6224284Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-12-04T11:11:09.6224353Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-12-04T11:11:09.6224423Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-12-04T11:11:09.6224495Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-12-04T11:11:09.6224566Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-12-04T11:11:09.6224636Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-12-04T11:11:09.6224711Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-12-04T11:11:09.6224781Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-12-04T11:11:09.6224853Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-12-04T11:11:09.6224923Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-12-04T11:11:09.6224993Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-12-04T11:11:09.6225065Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 
2025-12-04T11:11:09.6225135Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-12-04T11:11:09.6225207Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-12-04T11:11:09.6225279Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-12-04T11:11:09.6225353Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-12-04T11:11:09.6225424Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-12-04T11:11:09.6225496Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-12-04T11:11:09.6225567Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-12-04T11:11:09.6225638Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-12-04T11:11:09.6225712Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-12-04T11:11:09.6225783Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-12-04T11:11:09.6225855Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-12-04T11:11:09.6225931Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-12-04T11:11:09.6226003Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-12-04T11:11:09.6226076Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-12-04T11:11:09.6226151Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-12-04T11:11:09.6226220Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-12-04T11:11:09.6226292Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-12-04T11:11:09.6226364Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-12-04T11:11:09.6226454Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-12-04T11:11:09.6226524Z * [new branch] gh/janeyx99/305/base -> origin/gh/janeyx99/305/base 2025-12-04T11:11:09.6226594Z * [new branch] gh/janeyx99/305/head -> origin/gh/janeyx99/305/head 2025-12-04T11:11:09.6226687Z * [new branch] gh/janeyx99/306/base -> origin/gh/janeyx99/306/base 2025-12-04T11:11:09.6226760Z * [new branch] gh/janeyx99/306/head -> origin/gh/janeyx99/306/head 2025-12-04T11:11:09.6226832Z * [new branch] gh/janeyx99/314/base -> origin/gh/janeyx99/314/base 2025-12-04T11:11:09.6226903Z * [new branch] gh/janeyx99/314/head -> origin/gh/janeyx99/314/head 2025-12-04T11:11:09.6226978Z * [new branch] gh/janeyx99/314/orig -> origin/gh/janeyx99/314/orig 2025-12-04T11:11:09.6227049Z * [new branch] gh/janeyx99/315/base -> origin/gh/janeyx99/315/base 2025-12-04T11:11:09.6227123Z * [new branch] gh/janeyx99/315/head -> origin/gh/janeyx99/315/head 2025-12-04T11:11:09.6227198Z * [new branch] gh/janeyx99/315/orig -> origin/gh/janeyx99/315/orig 2025-12-04T11:11:09.6227269Z * [new branch] gh/janeyx99/316/base -> origin/gh/janeyx99/316/base 2025-12-04T11:11:09.6227342Z * [new branch] gh/janeyx99/316/head -> origin/gh/janeyx99/316/head 2025-12-04T11:11:09.6227419Z * [new branch] gh/janeyx99/316/orig -> origin/gh/janeyx99/316/orig 2025-12-04T11:11:09.6227491Z * [new branch] gh/janeyx99/317/base -> origin/gh/janeyx99/317/base 2025-12-04T11:11:09.6227564Z * [new branch] gh/janeyx99/317/head -> origin/gh/janeyx99/317/head 2025-12-04T11:11:09.6227639Z * [new branch] gh/janeyx99/317/orig -> origin/gh/janeyx99/317/orig 2025-12-04T11:11:09.6227710Z * [new branch] gh/janeyx99/325/base -> origin/gh/janeyx99/325/base 2025-12-04T11:11:09.6227782Z * [new branch] gh/janeyx99/325/head -> origin/gh/janeyx99/325/head 
2025-12-04T11:11:09.6227859Z * [new branch] gh/janeyx99/325/orig -> origin/gh/janeyx99/325/orig 2025-12-04T11:11:09.6227930Z * [new branch] gh/janeyx99/327/base -> origin/gh/janeyx99/327/base 2025-12-04T11:11:09.6228008Z * [new branch] gh/janeyx99/327/head -> origin/gh/janeyx99/327/head 2025-12-04T11:11:09.6228080Z * [new branch] gh/janeyx99/327/orig -> origin/gh/janeyx99/327/orig 2025-12-04T11:11:09.6228201Z * [new branch] gh/janeyx99/328/base -> origin/gh/janeyx99/328/base 2025-12-04T11:11:09.6228277Z * [new branch] gh/janeyx99/328/head -> origin/gh/janeyx99/328/head 2025-12-04T11:11:09.6228348Z * [new branch] gh/janeyx99/328/orig -> origin/gh/janeyx99/328/orig 2025-12-04T11:11:09.6228420Z * [new branch] gh/janeyx99/329/base -> origin/gh/janeyx99/329/base 2025-12-04T11:11:09.6228497Z * [new branch] gh/janeyx99/329/head -> origin/gh/janeyx99/329/head 2025-12-04T11:11:09.6228569Z * [new branch] gh/janeyx99/329/orig -> origin/gh/janeyx99/329/orig 2025-12-04T11:11:09.6228642Z * [new branch] gh/janeyx99/330/base -> origin/gh/janeyx99/330/base 2025-12-04T11:11:09.6228720Z * [new branch] gh/janeyx99/330/head -> origin/gh/janeyx99/330/head 2025-12-04T11:11:09.6228789Z * [new branch] gh/janeyx99/330/orig -> origin/gh/janeyx99/330/orig 2025-12-04T11:11:09.6228859Z * [new branch] gh/janeyx99/331/base -> origin/gh/janeyx99/331/base 2025-12-04T11:11:09.6228938Z * [new branch] gh/janeyx99/331/head -> origin/gh/janeyx99/331/head 2025-12-04T11:11:09.6229011Z * [new branch] gh/janeyx99/331/orig -> origin/gh/janeyx99/331/orig 2025-12-04T11:11:09.6229082Z * [new branch] gh/janeyx99/332/base -> origin/gh/janeyx99/332/base 2025-12-04T11:11:09.6229189Z * [new branch] gh/janeyx99/332/head -> origin/gh/janeyx99/332/head 2025-12-04T11:11:09.6229262Z * [new branch] gh/janeyx99/332/orig -> origin/gh/janeyx99/332/orig 2025-12-04T11:11:09.6229332Z * [new branch] gh/janeyx99/333/base -> origin/gh/janeyx99/333/base 2025-12-04T11:11:09.6229436Z * [new branch] gh/janeyx99/333/head -> origin/gh/janeyx99/333/head 2025-12-04T11:11:09.6229508Z * [new branch] gh/janeyx99/333/orig -> origin/gh/janeyx99/333/orig 2025-12-04T11:11:09.6229582Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-12-04T11:11:09.6229653Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-12-04T11:11:09.6229723Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-12-04T11:11:09.6229799Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-12-04T11:11:09.6229874Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-12-04T11:11:09.6229944Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-12-04T11:11:09.6230017Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-12-04T11:11:09.6230088Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-12-04T11:11:09.6230155Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-12-04T11:11:09.6230227Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-12-04T11:11:09.6230295Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-12-04T11:11:09.6230364Z * [new branch] gh/jansel/533/base -> origin/gh/jansel/533/base 2025-12-04T11:11:09.6230436Z * [new branch] gh/jansel/533/head -> origin/gh/jansel/533/head 2025-12-04T11:11:09.6230506Z * [new branch] gh/jansel/533/orig -> origin/gh/jansel/533/orig 2025-12-04T11:11:09.6230578Z * [new branch] gh/jansel/552/base -> origin/gh/jansel/552/base 2025-12-04T11:11:09.6230651Z * [new branch] 
gh/jansel/552/head -> origin/gh/jansel/552/head 2025-12-04T11:11:09.6230721Z * [new branch] gh/jansel/552/orig -> origin/gh/jansel/552/orig 2025-12-04T11:11:09.6230791Z * [new branch] gh/jansel/553/base -> origin/gh/jansel/553/base 2025-12-04T11:11:09.6230865Z * [new branch] gh/jansel/553/head -> origin/gh/jansel/553/head 2025-12-04T11:11:09.6230936Z * [new branch] gh/jansel/553/orig -> origin/gh/jansel/553/orig 2025-12-04T11:11:09.6231005Z * [new branch] gh/jansel/554/base -> origin/gh/jansel/554/base 2025-12-04T11:11:09.6231079Z * [new branch] gh/jansel/554/head -> origin/gh/jansel/554/head 2025-12-04T11:11:09.6231150Z * [new branch] gh/jansel/554/orig -> origin/gh/jansel/554/orig 2025-12-04T11:11:09.6231219Z * [new branch] gh/jansel/555/base -> origin/gh/jansel/555/base 2025-12-04T11:11:09.6231293Z * [new branch] gh/jansel/555/head -> origin/gh/jansel/555/head 2025-12-04T11:11:09.6231364Z * [new branch] gh/jansel/555/orig -> origin/gh/jansel/555/orig 2025-12-04T11:11:09.6231437Z * [new branch] gh/jansel/556/base -> origin/gh/jansel/556/base 2025-12-04T11:11:09.6231507Z * [new branch] gh/jansel/556/head -> origin/gh/jansel/556/head 2025-12-04T11:11:09.6231576Z * [new branch] gh/jansel/556/orig -> origin/gh/jansel/556/orig 2025-12-04T11:11:09.6231649Z * [new branch] gh/jansel/557/base -> origin/gh/jansel/557/base 2025-12-04T11:11:09.6231719Z * [new branch] gh/jansel/557/head -> origin/gh/jansel/557/head 2025-12-04T11:11:09.6231807Z * [new branch] gh/jansel/557/orig -> origin/gh/jansel/557/orig 2025-12-04T11:11:09.6231882Z * [new branch] gh/jansel/558/base -> origin/gh/jansel/558/base 2025-12-04T11:11:09.6231953Z * [new branch] gh/jansel/558/head -> origin/gh/jansel/558/head 2025-12-04T11:11:09.6232050Z * [new branch] gh/jansel/558/orig -> origin/gh/jansel/558/orig 2025-12-04T11:11:09.6232124Z * [new branch] gh/jansel/559/base -> origin/gh/jansel/559/base 2025-12-04T11:11:09.6232194Z * [new branch] gh/jansel/559/head -> origin/gh/jansel/559/head 2025-12-04T11:11:09.6232262Z * [new branch] gh/jansel/559/orig -> origin/gh/jansel/559/orig 2025-12-04T11:11:09.6232335Z * [new branch] gh/jansel/560/base -> origin/gh/jansel/560/base 2025-12-04T11:11:09.6232405Z * [new branch] gh/jansel/560/head -> origin/gh/jansel/560/head 2025-12-04T11:11:09.6232475Z * [new branch] gh/jansel/560/orig -> origin/gh/jansel/560/orig 2025-12-04T11:11:09.6232549Z * [new branch] gh/jansel/561/base -> origin/gh/jansel/561/base 2025-12-04T11:11:09.6232619Z * [new branch] gh/jansel/561/head -> origin/gh/jansel/561/head 2025-12-04T11:11:09.6232690Z * [new branch] gh/jansel/561/orig -> origin/gh/jansel/561/orig 2025-12-04T11:11:09.6232763Z * [new branch] gh/jansel/562/base -> origin/gh/jansel/562/base 2025-12-04T11:11:09.6232832Z * [new branch] gh/jansel/562/head -> origin/gh/jansel/562/head 2025-12-04T11:11:09.6232901Z * [new branch] gh/jansel/562/orig -> origin/gh/jansel/562/orig 2025-12-04T11:11:09.6232974Z * [new branch] gh/jansel/563/base -> origin/gh/jansel/563/base 2025-12-04T11:11:09.6233043Z * [new branch] gh/jansel/563/head -> origin/gh/jansel/563/head 2025-12-04T11:11:09.6233117Z * [new branch] gh/jansel/563/orig -> origin/gh/jansel/563/orig 2025-12-04T11:11:09.6233187Z * [new branch] gh/jansel/564/base -> origin/gh/jansel/564/base 2025-12-04T11:11:09.6233257Z * [new branch] gh/jansel/564/head -> origin/gh/jansel/564/head 2025-12-04T11:11:09.6233331Z * [new branch] gh/jansel/564/orig -> origin/gh/jansel/564/orig 2025-12-04T11:11:09.6233400Z * [new branch] gh/jansel/565/base -> origin/gh/jansel/565/base 
2025-12-04T11:11:09.6233469Z * [new branch] gh/jansel/565/head -> origin/gh/jansel/565/head 2025-12-04T11:11:09.6233544Z * [new branch] gh/jansel/565/orig -> origin/gh/jansel/565/orig 2025-12-04T11:11:09.6233612Z * [new branch] gh/jansel/566/base -> origin/gh/jansel/566/base 2025-12-04T11:11:09.6233683Z * [new branch] gh/jansel/566/head -> origin/gh/jansel/566/head 2025-12-04T11:11:09.6233757Z * [new branch] gh/jansel/566/orig -> origin/gh/jansel/566/orig 2025-12-04T11:11:09.6233826Z * [new branch] gh/jansel/567/base -> origin/gh/jansel/567/base 2025-12-04T11:11:09.6233896Z * [new branch] gh/jansel/567/head -> origin/gh/jansel/567/head 2025-12-04T11:11:09.6233970Z * [new branch] gh/jansel/567/orig -> origin/gh/jansel/567/orig 2025-12-04T11:11:09.6234039Z * [new branch] gh/jansel/568/base -> origin/gh/jansel/568/base 2025-12-04T11:11:09.6234109Z * [new branch] gh/jansel/568/head -> origin/gh/jansel/568/head 2025-12-04T11:11:09.6234184Z * [new branch] gh/jansel/568/orig -> origin/gh/jansel/568/orig 2025-12-04T11:11:09.6234253Z * [new branch] gh/jansel/569/base -> origin/gh/jansel/569/base 2025-12-04T11:11:09.6234322Z * [new branch] gh/jansel/569/head -> origin/gh/jansel/569/head 2025-12-04T11:11:09.6234417Z * [new branch] gh/jansel/569/orig -> origin/gh/jansel/569/orig 2025-12-04T11:11:09.6234487Z * [new branch] gh/jansel/570/base -> origin/gh/jansel/570/base 2025-12-04T11:11:09.6234557Z * [new branch] gh/jansel/570/head -> origin/gh/jansel/570/head 2025-12-04T11:11:09.6234652Z * [new branch] gh/jansel/570/orig -> origin/gh/jansel/570/orig 2025-12-04T11:11:09.6234722Z * [new branch] gh/jansel/571/base -> origin/gh/jansel/571/base 2025-12-04T11:11:09.6234796Z * [new branch] gh/jansel/571/head -> origin/gh/jansel/571/head 2025-12-04T11:11:09.6234865Z * [new branch] gh/jansel/571/orig -> origin/gh/jansel/571/orig 2025-12-04T11:11:09.6234934Z * [new branch] gh/jansel/572/base -> origin/gh/jansel/572/base 2025-12-04T11:11:09.6235008Z * [new branch] gh/jansel/572/head -> origin/gh/jansel/572/head 2025-12-04T11:11:09.6235079Z * [new branch] gh/jansel/572/orig -> origin/gh/jansel/572/orig 2025-12-04T11:11:09.6235151Z * [new branch] gh/jansel/573/base -> origin/gh/jansel/573/base 2025-12-04T11:11:09.6235224Z * [new branch] gh/jansel/573/head -> origin/gh/jansel/573/head 2025-12-04T11:11:09.6235294Z * [new branch] gh/jansel/573/orig -> origin/gh/jansel/573/orig 2025-12-04T11:11:09.6235363Z * [new branch] gh/jansel/574/base -> origin/gh/jansel/574/base 2025-12-04T11:11:09.6235433Z * [new branch] gh/jansel/574/head -> origin/gh/jansel/574/head 2025-12-04T11:11:09.6235501Z * [new branch] gh/jansel/574/orig -> origin/gh/jansel/574/orig 2025-12-04T11:11:09.6235568Z * [new branch] gh/jansel/575/base -> origin/gh/jansel/575/base 2025-12-04T11:11:09.6235637Z * [new branch] gh/jansel/575/head -> origin/gh/jansel/575/head 2025-12-04T11:11:09.6235703Z * [new branch] gh/jansel/575/orig -> origin/gh/jansel/575/orig 2025-12-04T11:11:09.6235774Z * [new branch] gh/jansel/576/base -> origin/gh/jansel/576/base 2025-12-04T11:11:09.6235848Z * [new branch] gh/jansel/576/head -> origin/gh/jansel/576/head 2025-12-04T11:11:09.6235919Z * [new branch] gh/jansel/576/orig -> origin/gh/jansel/576/orig 2025-12-04T11:11:09.6236002Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-12-04T11:11:09.6236088Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-12-04T11:11:09.6236167Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-12-04T11:11:09.6236244Z 
* [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-12-04T11:11:09.6236325Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-12-04T11:11:09.6236403Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-12-04T11:11:09.6236482Z * [new branch] gh/jerryzh168/1/base -> origin/gh/jerryzh168/1/base 2025-12-04T11:11:09.6236556Z * [new branch] gh/jerryzh168/1/head -> origin/gh/jerryzh168/1/head 2025-12-04T11:11:09.6236632Z * [new branch] gh/jerryzh168/1/orig -> origin/gh/jerryzh168/1/orig 2025-12-04T11:11:09.6236712Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-12-04T11:11:09.6236786Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-12-04T11:11:09.6236861Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-12-04T11:11:09.6236937Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-12-04T11:11:09.6237010Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-12-04T11:11:09.6237108Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-12-04T11:11:09.6237185Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-12-04T11:11:09.6237258Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-12-04T11:11:09.6237354Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-12-04T11:11:09.6237432Z * [new branch] gh/jiayisunx/77/base -> origin/gh/jiayisunx/77/base 2025-12-04T11:11:09.6237506Z * [new branch] gh/jiayisunx/77/head -> origin/gh/jiayisunx/77/head 2025-12-04T11:11:09.6237579Z * [new branch] gh/jiayisunx/77/orig -> origin/gh/jiayisunx/77/orig 2025-12-04T11:11:09.6237656Z * [new branch] gh/jiayisunx/78/base -> origin/gh/jiayisunx/78/base 2025-12-04T11:11:09.6237730Z * [new branch] gh/jiayisunx/78/head -> origin/gh/jiayisunx/78/head 2025-12-04T11:11:09.6237805Z * [new branch] gh/jiayisunx/78/orig -> origin/gh/jiayisunx/78/orig 2025-12-04T11:11:09.6237881Z * [new branch] gh/jiayisunx/79/base -> origin/gh/jiayisunx/79/base 2025-12-04T11:11:09.6237953Z * [new branch] gh/jiayisunx/79/head -> origin/gh/jiayisunx/79/head 2025-12-04T11:11:09.6238031Z * [new branch] gh/jiayisunx/79/orig -> origin/gh/jiayisunx/79/orig 2025-12-04T11:11:09.6238103Z * [new branch] gh/jiayisunx/82/base -> origin/gh/jiayisunx/82/base 2025-12-04T11:11:09.6238210Z * [new branch] gh/jiayisunx/82/head -> origin/gh/jiayisunx/82/head 2025-12-04T11:11:09.6238290Z * [new branch] gh/jiayisunx/82/orig -> origin/gh/jiayisunx/82/orig 2025-12-04T11:11:09.6238365Z * [new branch] gh/jiayisunx/83/base -> origin/gh/jiayisunx/83/base 2025-12-04T11:11:09.6238438Z * [new branch] gh/jiayisunx/83/head -> origin/gh/jiayisunx/83/head 2025-12-04T11:11:09.6238519Z * [new branch] gh/jiayisunx/83/orig -> origin/gh/jiayisunx/83/orig 2025-12-04T11:11:09.6238593Z * [new branch] gh/jiayisunx/84/base -> origin/gh/jiayisunx/84/base 2025-12-04T11:11:09.6238668Z * [new branch] gh/jiayisunx/84/head -> origin/gh/jiayisunx/84/head 2025-12-04T11:11:09.6238748Z * [new branch] gh/jiayisunx/84/orig -> origin/gh/jiayisunx/84/orig 2025-12-04T11:11:09.6238821Z * [new branch] gh/jiayisunx/85/base -> origin/gh/jiayisunx/85/base 2025-12-04T11:11:09.6238895Z * [new branch] gh/jiayisunx/85/head -> origin/gh/jiayisunx/85/head 2025-12-04T11:11:09.6238974Z * [new branch] gh/jiayisunx/85/orig -> origin/gh/jiayisunx/85/orig 2025-12-04T11:11:09.6239049Z * [new branch] gh/jiayisunx/86/base -> origin/gh/jiayisunx/86/base 
2025-12-04T11:11:09.6239122Z * [new branch] gh/jiayisunx/86/head -> origin/gh/jiayisunx/86/head 2025-12-04T11:11:09.6239197Z * [new branch] gh/jiayisunx/86/orig -> origin/gh/jiayisunx/86/orig 2025-12-04T11:11:09.6239271Z * [new branch] gh/jiayisunx/87/base -> origin/gh/jiayisunx/87/base 2025-12-04T11:11:09.6239345Z * [new branch] gh/jiayisunx/87/head -> origin/gh/jiayisunx/87/head 2025-12-04T11:11:09.6239420Z * [new branch] gh/jiayisunx/87/orig -> origin/gh/jiayisunx/87/orig 2025-12-04T11:11:09.6239494Z * [new branch] gh/jiayisunx/88/base -> origin/gh/jiayisunx/88/base 2025-12-04T11:11:09.6239566Z * [new branch] gh/jiayisunx/88/head -> origin/gh/jiayisunx/88/head 2025-12-04T11:11:09.6239643Z * [new branch] gh/jiayisunx/88/orig -> origin/gh/jiayisunx/88/orig 2025-12-04T11:11:09.6239716Z * [new branch] gh/jiayisunx/89/base -> origin/gh/jiayisunx/89/base 2025-12-04T11:11:09.6239792Z * [new branch] gh/jiayisunx/89/head -> origin/gh/jiayisunx/89/head 2025-12-04T11:11:09.6239896Z * [new branch] gh/jiayisunx/89/orig -> origin/gh/jiayisunx/89/orig 2025-12-04T11:11:09.6239969Z * [new branch] gh/jiayisunx/90/base -> origin/gh/jiayisunx/90/base 2025-12-04T11:11:09.6240071Z * [new branch] gh/jiayisunx/90/head -> origin/gh/jiayisunx/90/head 2025-12-04T11:11:09.6240144Z * [new branch] gh/jiayisunx/90/orig -> origin/gh/jiayisunx/90/orig 2025-12-04T11:11:09.6240224Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-12-04T11:11:09.6240306Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-12-04T11:11:09.6240377Z * [new branch] gh/jturney/1/base -> origin/gh/jturney/1/base 2025-12-04T11:11:09.6240448Z * [new branch] gh/jturney/1/head -> origin/gh/jturney/1/head 2025-12-04T11:11:09.6240520Z * [new branch] gh/jturney/1/orig -> origin/gh/jturney/1/orig 2025-12-04T11:11:09.6240593Z * [new branch] gh/jturney/2/base -> origin/gh/jturney/2/base 2025-12-04T11:11:09.6240663Z * [new branch] gh/jturney/2/head -> origin/gh/jturney/2/head 2025-12-04T11:11:09.6240739Z * [new branch] gh/jturney/2/orig -> origin/gh/jturney/2/orig 2025-12-04T11:11:09.6240818Z * [new branch] gh/karthickai/10/base -> origin/gh/karthickai/10/base 2025-12-04T11:11:09.6240896Z * [new branch] gh/karthickai/10/head -> origin/gh/karthickai/10/head 2025-12-04T11:11:09.6240977Z * [new branch] gh/karthickai/10/orig -> origin/gh/karthickai/10/orig 2025-12-04T11:11:09.6241056Z * [new branch] gh/karthickai/11/base -> origin/gh/karthickai/11/base 2025-12-04T11:11:09.6241132Z * [new branch] gh/karthickai/11/head -> origin/gh/karthickai/11/head 2025-12-04T11:11:09.6241214Z * [new branch] gh/karthickai/11/orig -> origin/gh/karthickai/11/orig 2025-12-04T11:11:09.6241290Z * [new branch] gh/karthickai/12/base -> origin/gh/karthickai/12/base 2025-12-04T11:11:09.6241369Z * [new branch] gh/karthickai/12/head -> origin/gh/karthickai/12/head 2025-12-04T11:11:09.6241445Z * [new branch] gh/karthickai/12/orig -> origin/gh/karthickai/12/orig 2025-12-04T11:11:09.6241519Z * [new branch] gh/karthickai/13/base -> origin/gh/karthickai/13/base 2025-12-04T11:11:09.6241598Z * [new branch] gh/karthickai/13/head -> origin/gh/karthickai/13/head 2025-12-04T11:11:09.6241673Z * [new branch] gh/karthickai/13/orig -> origin/gh/karthickai/13/orig 2025-12-04T11:11:09.6241748Z * [new branch] gh/karthickai/14/base -> origin/gh/karthickai/14/base 2025-12-04T11:11:09.6241825Z * [new branch] gh/karthickai/14/head -> origin/gh/karthickai/14/head 2025-12-04T11:11:09.6241901Z * [new branch] gh/karthickai/14/orig -> 
origin/gh/karthickai/14/orig 2025-12-04T11:11:09.6241978Z * [new branch] gh/karthickai/15/base -> origin/gh/karthickai/15/base 2025-12-04T11:11:09.6242057Z * [new branch] gh/karthickai/15/head -> origin/gh/karthickai/15/head 2025-12-04T11:11:09.6242133Z * [new branch] gh/karthickai/15/orig -> origin/gh/karthickai/15/orig 2025-12-04T11:11:09.6242208Z * [new branch] gh/karthickai/16/base -> origin/gh/karthickai/16/base 2025-12-04T11:11:09.6242287Z * [new branch] gh/karthickai/16/head -> origin/gh/karthickai/16/head 2025-12-04T11:11:09.6242363Z * [new branch] gh/karthickai/16/orig -> origin/gh/karthickai/16/orig 2025-12-04T11:11:09.6242438Z * [new branch] gh/karthickai/17/base -> origin/gh/karthickai/17/base 2025-12-04T11:11:09.6242516Z * [new branch] gh/karthickai/17/head -> origin/gh/karthickai/17/head 2025-12-04T11:11:09.6242613Z * [new branch] gh/karthickai/17/orig -> origin/gh/karthickai/17/orig 2025-12-04T11:11:09.6242689Z * [new branch] gh/karthickai/18/base -> origin/gh/karthickai/18/base 2025-12-04T11:11:09.6242766Z * [new branch] gh/karthickai/18/head -> origin/gh/karthickai/18/head 2025-12-04T11:11:09.6242872Z * [new branch] gh/karthickai/18/orig -> origin/gh/karthickai/18/orig 2025-12-04T11:11:09.6242951Z * [new branch] gh/karthickai/19/base -> origin/gh/karthickai/19/base 2025-12-04T11:11:09.6243024Z * [new branch] gh/karthickai/19/head -> origin/gh/karthickai/19/head 2025-12-04T11:11:09.6243098Z * [new branch] gh/karthickai/19/orig -> origin/gh/karthickai/19/orig 2025-12-04T11:11:09.6243172Z * [new branch] gh/karthickai/20/base -> origin/gh/karthickai/20/base 2025-12-04T11:11:09.6243247Z * [new branch] gh/karthickai/20/head -> origin/gh/karthickai/20/head 2025-12-04T11:11:09.6243324Z * [new branch] gh/karthickai/20/orig -> origin/gh/karthickai/20/orig 2025-12-04T11:11:09.6243402Z * [new branch] gh/karthickai/21/base -> origin/gh/karthickai/21/base 2025-12-04T11:11:09.6243477Z * [new branch] gh/karthickai/21/head -> origin/gh/karthickai/21/head 2025-12-04T11:11:09.6243553Z * [new branch] gh/karthickai/21/orig -> origin/gh/karthickai/21/orig 2025-12-04T11:11:09.6243632Z * [new branch] gh/karthickai/22/base -> origin/gh/karthickai/22/base 2025-12-04T11:11:09.6243706Z * [new branch] gh/karthickai/22/head -> origin/gh/karthickai/22/head 2025-12-04T11:11:09.6243781Z * [new branch] gh/karthickai/22/orig -> origin/gh/karthickai/22/orig 2025-12-04T11:11:09.6243858Z * [new branch] gh/karthickai/23/base -> origin/gh/karthickai/23/base 2025-12-04T11:11:09.6243934Z * [new branch] gh/karthickai/23/head -> origin/gh/karthickai/23/head 2025-12-04T11:11:09.6244009Z * [new branch] gh/karthickai/23/orig -> origin/gh/karthickai/23/orig 2025-12-04T11:11:09.6244088Z * [new branch] gh/karthickai/24/base -> origin/gh/karthickai/24/base 2025-12-04T11:11:09.6244163Z * [new branch] gh/karthickai/24/head -> origin/gh/karthickai/24/head 2025-12-04T11:11:09.6244239Z * [new branch] gh/karthickai/24/orig -> origin/gh/karthickai/24/orig 2025-12-04T11:11:09.6244317Z * [new branch] gh/karthickai/25/base -> origin/gh/karthickai/25/base 2025-12-04T11:11:09.6244392Z * [new branch] gh/karthickai/25/head -> origin/gh/karthickai/25/head 2025-12-04T11:11:09.6244466Z * [new branch] gh/karthickai/25/orig -> origin/gh/karthickai/25/orig 2025-12-04T11:11:09.6244544Z * [new branch] gh/karthickai/26/base -> origin/gh/karthickai/26/base 2025-12-04T11:11:09.6244619Z * [new branch] gh/karthickai/26/head -> origin/gh/karthickai/26/head 2025-12-04T11:11:09.6244698Z * [new branch] gh/karthickai/26/orig -> 
origin/gh/karthickai/26/orig 2025-12-04T11:11:09.6244771Z * [new branch] gh/karthickai/6/base -> origin/gh/karthickai/6/base 2025-12-04T11:11:09.6244848Z * [new branch] gh/karthickai/6/head -> origin/gh/karthickai/6/head 2025-12-04T11:11:09.6244927Z * [new branch] gh/karthickai/6/orig -> origin/gh/karthickai/6/orig 2025-12-04T11:11:09.6244998Z * [new branch] gh/krocki/1/base -> origin/gh/krocki/1/base 2025-12-04T11:11:09.6245070Z * [new branch] gh/krocki/1/head -> origin/gh/krocki/1/head 2025-12-04T11:11:09.6245143Z * [new branch] gh/krocki/1/orig -> origin/gh/krocki/1/orig 2025-12-04T11:11:09.6245212Z * [new branch] gh/krocki/2/base -> origin/gh/krocki/2/base 2025-12-04T11:11:09.6245280Z * [new branch] gh/krocki/2/head -> origin/gh/krocki/2/head 2025-12-04T11:11:09.6245376Z * [new branch] gh/krocki/2/orig -> origin/gh/krocki/2/orig 2025-12-04T11:11:09.6245460Z * [new branch] gh/kurtamohler/60/base -> origin/gh/kurtamohler/60/base 2025-12-04T11:11:09.6245566Z * [new branch] gh/kurtamohler/60/head -> origin/gh/kurtamohler/60/head 2025-12-04T11:11:09.6245648Z * [new branch] gh/kurtamohler/60/orig -> origin/gh/kurtamohler/60/orig 2025-12-04T11:11:09.6245726Z * [new branch] gh/kurtamohler/61/base -> origin/gh/kurtamohler/61/base 2025-12-04T11:11:09.6245804Z * [new branch] gh/kurtamohler/61/head -> origin/gh/kurtamohler/61/head 2025-12-04T11:11:09.6245884Z * [new branch] gh/kurtamohler/61/orig -> origin/gh/kurtamohler/61/orig 2025-12-04T11:11:09.6245959Z * [new branch] gh/kurtamohler/62/base -> origin/gh/kurtamohler/62/base 2025-12-04T11:11:09.6246035Z * [new branch] gh/kurtamohler/62/head -> origin/gh/kurtamohler/62/head 2025-12-04T11:11:09.6246116Z * [new branch] gh/kurtamohler/62/orig -> origin/gh/kurtamohler/62/orig 2025-12-04T11:11:09.6246194Z * [new branch] gh/kurtamohler/63/base -> origin/gh/kurtamohler/63/base 2025-12-04T11:11:09.6246275Z * [new branch] gh/kurtamohler/63/head -> origin/gh/kurtamohler/63/head 2025-12-04T11:11:09.6246351Z * [new branch] gh/kurtamohler/63/orig -> origin/gh/kurtamohler/63/orig 2025-12-04T11:11:09.6246426Z * [new branch] gh/kurtamohler/64/base -> origin/gh/kurtamohler/64/base 2025-12-04T11:11:09.6246505Z * [new branch] gh/kurtamohler/64/head -> origin/gh/kurtamohler/64/head 2025-12-04T11:11:09.6246581Z * [new branch] gh/kurtamohler/64/orig -> origin/gh/kurtamohler/64/orig 2025-12-04T11:11:09.6246656Z * [new branch] gh/kurtamohler/65/base -> origin/gh/kurtamohler/65/base 2025-12-04T11:11:09.6246737Z * [new branch] gh/kurtamohler/65/head -> origin/gh/kurtamohler/65/head 2025-12-04T11:11:09.6246812Z * [new branch] gh/kurtamohler/65/orig -> origin/gh/kurtamohler/65/orig 2025-12-04T11:11:09.6246888Z * [new branch] gh/kurtamohler/66/base -> origin/gh/kurtamohler/66/base 2025-12-04T11:11:09.6246972Z * [new branch] gh/kurtamohler/66/head -> origin/gh/kurtamohler/66/head 2025-12-04T11:11:09.6247049Z * [new branch] gh/kurtamohler/66/orig -> origin/gh/kurtamohler/66/orig 2025-12-04T11:11:09.6247125Z * [new branch] gh/kurtamohler/67/base -> origin/gh/kurtamohler/67/base 2025-12-04T11:11:09.6247204Z * [new branch] gh/kurtamohler/67/head -> origin/gh/kurtamohler/67/head 2025-12-04T11:11:09.6247280Z * [new branch] gh/kurtamohler/67/orig -> origin/gh/kurtamohler/67/orig 2025-12-04T11:11:09.6247352Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-12-04T11:11:09.6247428Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-12-04T11:11:09.6247501Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 
2025-12-04T11:11:09.6247572Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-12-04T11:11:09.6247647Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-12-04T11:11:09.6247716Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-12-04T11:11:09.6247785Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-12-04T11:11:09.6247858Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-12-04T11:11:09.6247928Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-12-04T11:11:09.6248002Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-12-04T11:11:09.6248094Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-12-04T11:11:09.6248200Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-12-04T11:11:09.6248275Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-12-04T11:11:09.6248382Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-12-04T11:11:09.6248452Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-12-04T11:11:09.6248526Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-12-04T11:11:09.6248595Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-12-04T11:11:09.6248666Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-12-04T11:11:09.6248739Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-12-04T11:11:09.6248811Z * [new branch] gh/kwen2501/234/base -> origin/gh/kwen2501/234/base 2025-12-04T11:11:09.6248881Z * [new branch] gh/kwen2501/234/head -> origin/gh/kwen2501/234/head 2025-12-04T11:11:09.6248955Z * [new branch] gh/kwen2501/234/orig -> origin/gh/kwen2501/234/orig 2025-12-04T11:11:09.6249025Z * [new branch] gh/kwen2501/235/base -> origin/gh/kwen2501/235/base 2025-12-04T11:11:09.6249095Z * [new branch] gh/kwen2501/235/head -> origin/gh/kwen2501/235/head 2025-12-04T11:11:09.6249169Z * [new branch] gh/kwen2501/235/orig -> origin/gh/kwen2501/235/orig 2025-12-04T11:11:09.6249239Z * [new branch] gh/kwen2501/236/base -> origin/gh/kwen2501/236/base 2025-12-04T11:11:09.6249310Z * [new branch] gh/kwen2501/236/head -> origin/gh/kwen2501/236/head 2025-12-04T11:11:09.6249385Z * [new branch] gh/kwen2501/236/orig -> origin/gh/kwen2501/236/orig 2025-12-04T11:11:09.6249456Z * [new branch] gh/kwen2501/237/base -> origin/gh/kwen2501/237/base 2025-12-04T11:11:09.6249529Z * [new branch] gh/kwen2501/237/head -> origin/gh/kwen2501/237/head 2025-12-04T11:11:09.6249602Z * [new branch] gh/kwen2501/237/orig -> origin/gh/kwen2501/237/orig 2025-12-04T11:11:09.6249672Z * [new branch] gh/kwen2501/238/base -> origin/gh/kwen2501/238/base 2025-12-04T11:11:09.6249744Z * [new branch] gh/kwen2501/238/head -> origin/gh/kwen2501/238/head 2025-12-04T11:11:09.6249815Z * [new branch] gh/kwen2501/238/orig -> origin/gh/kwen2501/238/orig 2025-12-04T11:11:09.6249886Z * [new branch] gh/kwen2501/240/base -> origin/gh/kwen2501/240/base 2025-12-04T11:11:09.6249960Z * [new branch] gh/kwen2501/240/head -> origin/gh/kwen2501/240/head 2025-12-04T11:11:09.6250032Z * [new branch] gh/kwen2501/240/orig -> origin/gh/kwen2501/240/orig 2025-12-04T11:11:09.6250105Z * [new branch] gh/kwen2501/241/base -> origin/gh/kwen2501/241/base 2025-12-04T11:11:09.6250179Z * [new branch] gh/kwen2501/241/head -> origin/gh/kwen2501/241/head 2025-12-04T11:11:09.6250250Z * [new branch] gh/kwen2501/241/orig -> origin/gh/kwen2501/241/orig 
2025-12-04T11:11:09.6250321Z * [new branch] gh/kwen2501/247/base -> origin/gh/kwen2501/247/base 2025-12-04T11:11:09.6250396Z * [new branch] gh/kwen2501/247/head -> origin/gh/kwen2501/247/head 2025-12-04T11:11:09.6250468Z * [new branch] gh/kwen2501/247/orig -> origin/gh/kwen2501/247/orig 2025-12-04T11:11:09.6250539Z * [new branch] gh/kwen2501/252/base -> origin/gh/kwen2501/252/base 2025-12-04T11:11:09.6250613Z * [new branch] gh/kwen2501/252/head -> origin/gh/kwen2501/252/head 2025-12-04T11:11:09.6250684Z * [new branch] gh/kwen2501/252/orig -> origin/gh/kwen2501/252/orig 2025-12-04T11:11:09.6250782Z * [new branch] gh/kwen2501/259/base -> origin/gh/kwen2501/259/base 2025-12-04T11:11:09.6250857Z * [new branch] gh/kwen2501/259/head -> origin/gh/kwen2501/259/head 2025-12-04T11:11:09.6250948Z * [new branch] gh/kwen2501/259/orig -> origin/gh/kwen2501/259/orig 2025-12-04T11:11:09.6251022Z * [new branch] gh/kwen2501/260/base -> origin/gh/kwen2501/260/base 2025-12-04T11:11:09.6251094Z * [new branch] gh/kwen2501/260/head -> origin/gh/kwen2501/260/head 2025-12-04T11:11:09.6251165Z * [new branch] gh/kwen2501/260/orig -> origin/gh/kwen2501/260/orig 2025-12-04T11:11:09.6251238Z * [new branch] gh/kwen2501/268/base -> origin/gh/kwen2501/268/base 2025-12-04T11:11:09.6251309Z * [new branch] gh/kwen2501/268/head -> origin/gh/kwen2501/268/head 2025-12-04T11:11:09.6251378Z * [new branch] gh/kwen2501/268/orig -> origin/gh/kwen2501/268/orig 2025-12-04T11:11:09.6251451Z * [new branch] gh/kwen2501/269/base -> origin/gh/kwen2501/269/base 2025-12-04T11:11:09.6251522Z * [new branch] gh/kwen2501/269/head -> origin/gh/kwen2501/269/head 2025-12-04T11:11:09.6251593Z * [new branch] gh/kwen2501/269/orig -> origin/gh/kwen2501/269/orig 2025-12-04T11:11:09.6251668Z * [new branch] gh/kwen2501/270/base -> origin/gh/kwen2501/270/base 2025-12-04T11:11:09.6251738Z * [new branch] gh/kwen2501/270/head -> origin/gh/kwen2501/270/head 2025-12-04T11:11:09.6251808Z * [new branch] gh/kwen2501/270/orig -> origin/gh/kwen2501/270/orig 2025-12-04T11:11:09.6251882Z * [new branch] gh/kwen2501/271/base -> origin/gh/kwen2501/271/base 2025-12-04T11:11:09.6251953Z * [new branch] gh/kwen2501/271/head -> origin/gh/kwen2501/271/head 2025-12-04T11:11:09.6252024Z * [new branch] gh/kwen2501/271/orig -> origin/gh/kwen2501/271/orig 2025-12-04T11:11:09.6252100Z * [new branch] gh/kwen2501/274/base -> origin/gh/kwen2501/274/base 2025-12-04T11:11:09.6252172Z * [new branch] gh/kwen2501/274/head -> origin/gh/kwen2501/274/head 2025-12-04T11:11:09.6252244Z * [new branch] gh/kwen2501/274/orig -> origin/gh/kwen2501/274/orig 2025-12-04T11:11:09.6252320Z * [new branch] gh/kwen2501/275/base -> origin/gh/kwen2501/275/base 2025-12-04T11:11:09.6252391Z * [new branch] gh/kwen2501/275/head -> origin/gh/kwen2501/275/head 2025-12-04T11:11:09.6252462Z * [new branch] gh/kwen2501/275/orig -> origin/gh/kwen2501/275/orig 2025-12-04T11:11:09.6252536Z * [new branch] gh/kwen2501/276/base -> origin/gh/kwen2501/276/base 2025-12-04T11:11:09.6252607Z * [new branch] gh/kwen2501/276/head -> origin/gh/kwen2501/276/head 2025-12-04T11:11:09.6252681Z * [new branch] gh/kwen2501/276/orig -> origin/gh/kwen2501/276/orig 2025-12-04T11:11:09.6252752Z * [new branch] gh/kwen2501/277/base -> origin/gh/kwen2501/277/base 2025-12-04T11:11:09.6252825Z * [new branch] gh/kwen2501/277/head -> origin/gh/kwen2501/277/head 2025-12-04T11:11:09.6252902Z * [new branch] gh/kwen2501/277/orig -> origin/gh/kwen2501/277/orig 2025-12-04T11:11:09.6252972Z * [new branch] gh/kwen2501/278/base -> origin/gh/kwen2501/278/base 
2025-12-04T11:11:09.6253044Z * [new branch] gh/kwen2501/278/head -> origin/gh/kwen2501/278/head 2025-12-04T11:11:09.6253120Z * [new branch] gh/kwen2501/278/orig -> origin/gh/kwen2501/278/orig 2025-12-04T11:11:09.6253191Z * [new branch] gh/kwen2501/279/base -> origin/gh/kwen2501/279/base 2025-12-04T11:11:09.6253263Z * [new branch] gh/kwen2501/279/head -> origin/gh/kwen2501/279/head 2025-12-04T11:11:09.6253368Z * [new branch] gh/kwen2501/279/orig -> origin/gh/kwen2501/279/orig 2025-12-04T11:11:09.6253439Z * [new branch] gh/kwen2501/280/base -> origin/gh/kwen2501/280/base 2025-12-04T11:11:09.6253511Z * [new branch] gh/kwen2501/280/head -> origin/gh/kwen2501/280/head 2025-12-04T11:11:09.6253612Z * [new branch] gh/kwen2501/280/orig -> origin/gh/kwen2501/280/orig 2025-12-04T11:11:09.6253683Z * [new branch] gh/kwen2501/281/base -> origin/gh/kwen2501/281/base 2025-12-04T11:11:09.6253753Z * [new branch] gh/kwen2501/281/head -> origin/gh/kwen2501/281/head 2025-12-04T11:11:09.6253827Z * [new branch] gh/kwen2501/281/orig -> origin/gh/kwen2501/281/orig 2025-12-04T11:11:09.6253899Z * [new branch] gh/kwen2501/282/base -> origin/gh/kwen2501/282/base 2025-12-04T11:11:09.6253970Z * [new branch] gh/kwen2501/282/head -> origin/gh/kwen2501/282/head 2025-12-04T11:11:09.6254046Z * [new branch] gh/kwen2501/282/orig -> origin/gh/kwen2501/282/orig 2025-12-04T11:11:09.6254118Z * [new branch] gh/kwen2501/283/base -> origin/gh/kwen2501/283/base 2025-12-04T11:11:09.6254192Z * [new branch] gh/kwen2501/283/head -> origin/gh/kwen2501/283/head 2025-12-04T11:11:09.6254264Z * [new branch] gh/kwen2501/283/orig -> origin/gh/kwen2501/283/orig 2025-12-04T11:11:09.6254335Z * [new branch] gh/kwen2501/284/base -> origin/gh/kwen2501/284/base 2025-12-04T11:11:09.6254410Z * [new branch] gh/kwen2501/284/head -> origin/gh/kwen2501/284/head 2025-12-04T11:11:09.6254482Z * [new branch] gh/kwen2501/284/orig -> origin/gh/kwen2501/284/orig 2025-12-04T11:11:09.6254553Z * [new branch] gh/kwen2501/285/base -> origin/gh/kwen2501/285/base 2025-12-04T11:11:09.6254627Z * [new branch] gh/kwen2501/285/head -> origin/gh/kwen2501/285/head 2025-12-04T11:11:09.6254699Z * [new branch] gh/kwen2501/285/orig -> origin/gh/kwen2501/285/orig 2025-12-04T11:11:09.6254770Z * [new branch] gh/kwen2501/286/base -> origin/gh/kwen2501/286/base 2025-12-04T11:11:09.6254841Z * [new branch] gh/kwen2501/286/head -> origin/gh/kwen2501/286/head 2025-12-04T11:11:09.6254913Z * [new branch] gh/kwen2501/286/orig -> origin/gh/kwen2501/286/orig 2025-12-04T11:11:09.6254984Z * [new branch] gh/kwen2501/287/base -> origin/gh/kwen2501/287/base 2025-12-04T11:11:09.6255059Z * [new branch] gh/kwen2501/287/head -> origin/gh/kwen2501/287/head 2025-12-04T11:11:09.6255131Z * [new branch] gh/kwen2501/287/orig -> origin/gh/kwen2501/287/orig 2025-12-04T11:11:09.6255202Z * [new branch] gh/kwen2501/288/base -> origin/gh/kwen2501/288/base 2025-12-04T11:11:09.6255277Z * [new branch] gh/kwen2501/288/head -> origin/gh/kwen2501/288/head 2025-12-04T11:11:09.6255349Z * [new branch] gh/kwen2501/288/orig -> origin/gh/kwen2501/288/orig 2025-12-04T11:11:09.6255427Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-12-04T11:11:09.6255508Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-12-04T11:11:09.6255585Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-12-04T11:11:09.6255663Z * [new branch] gh/laithsakka/276/base -> origin/gh/laithsakka/276/base 2025-12-04T11:11:09.6255738Z * [new branch] gh/laithsakka/276/head -> 
origin/gh/laithsakka/276/head 2025-12-04T11:11:09.6255815Z * [new branch] gh/laithsakka/276/orig -> origin/gh/laithsakka/276/orig 2025-12-04T11:11:09.6255893Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-12-04T11:11:09.6255969Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-12-04T11:11:09.6256066Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-12-04T11:11:09.6256146Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-12-04T11:11:09.6256240Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-12-04T11:11:09.6256315Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-12-04T11:11:09.6256397Z * [new branch] gh/laithsakka/313/base -> origin/gh/laithsakka/313/base 2025-12-04T11:11:09.6256473Z * [new branch] gh/laithsakka/313/head -> origin/gh/laithsakka/313/head 2025-12-04T11:11:09.6256548Z * [new branch] gh/laithsakka/313/orig -> origin/gh/laithsakka/313/orig 2025-12-04T11:11:09.6256629Z * [new branch] gh/laithsakka/316/base -> origin/gh/laithsakka/316/base 2025-12-04T11:11:09.6256703Z * [new branch] gh/laithsakka/316/head -> origin/gh/laithsakka/316/head 2025-12-04T11:11:09.6256780Z * [new branch] gh/laithsakka/316/orig -> origin/gh/laithsakka/316/orig 2025-12-04T11:11:09.6256859Z * [new branch] gh/laithsakka/317/base -> origin/gh/laithsakka/317/base 2025-12-04T11:11:09.6256936Z * [new branch] gh/laithsakka/317/head -> origin/gh/laithsakka/317/head 2025-12-04T11:11:09.6257011Z * [new branch] gh/laithsakka/317/orig -> origin/gh/laithsakka/317/orig 2025-12-04T11:11:09.6257088Z * [new branch] gh/laithsakka/319/base -> origin/gh/laithsakka/319/base 2025-12-04T11:11:09.6257164Z * [new branch] gh/laithsakka/319/head -> origin/gh/laithsakka/319/head 2025-12-04T11:11:09.6257244Z * [new branch] gh/laithsakka/319/orig -> origin/gh/laithsakka/319/orig 2025-12-04T11:11:09.6257319Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-12-04T11:11:09.6257397Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-12-04T11:11:09.6257477Z * [new branch] gh/laithsakka/320/base -> origin/gh/laithsakka/320/base 2025-12-04T11:11:09.6257552Z * [new branch] gh/laithsakka/320/head -> origin/gh/laithsakka/320/head 2025-12-04T11:11:09.6257629Z * [new branch] gh/laithsakka/320/orig -> origin/gh/laithsakka/320/orig 2025-12-04T11:11:09.6257708Z * [new branch] gh/laithsakka/321/base -> origin/gh/laithsakka/321/base 2025-12-04T11:11:09.6257784Z * [new branch] gh/laithsakka/321/head -> origin/gh/laithsakka/321/head 2025-12-04T11:11:09.6257859Z * [new branch] gh/laithsakka/321/orig -> origin/gh/laithsakka/321/orig 2025-12-04T11:11:09.6257937Z * [new branch] gh/laithsakka/322/base -> origin/gh/laithsakka/322/base 2025-12-04T11:11:09.6258012Z * [new branch] gh/laithsakka/322/head -> origin/gh/laithsakka/322/head 2025-12-04T11:11:09.6258089Z * [new branch] gh/laithsakka/322/orig -> origin/gh/laithsakka/322/orig 2025-12-04T11:11:09.6258211Z * [new branch] gh/laithsakka/323/base -> origin/gh/laithsakka/323/base 2025-12-04T11:11:09.6258287Z * [new branch] gh/laithsakka/323/head -> origin/gh/laithsakka/323/head 2025-12-04T11:11:09.6258363Z * [new branch] gh/laithsakka/323/orig -> origin/gh/laithsakka/323/orig 2025-12-04T11:11:09.6258440Z * [new branch] gh/laithsakka/324/base -> origin/gh/laithsakka/324/base 2025-12-04T11:11:09.6258515Z * [new branch] gh/laithsakka/324/head -> origin/gh/laithsakka/324/head 2025-12-04T11:11:09.6258589Z * [new 
branch] gh/laithsakka/324/orig -> origin/gh/laithsakka/324/orig 2025-12-04T11:11:09.6258664Z * [new branch] gh/laithsakka/325/base -> origin/gh/laithsakka/325/base 2025-12-04T11:11:09.6258739Z * [new branch] gh/laithsakka/325/head -> origin/gh/laithsakka/325/head 2025-12-04T11:11:09.6258840Z * [new branch] gh/laithsakka/325/orig -> origin/gh/laithsakka/325/orig 2025-12-04T11:11:09.6258919Z * [new branch] gh/laithsakka/326/base -> origin/gh/laithsakka/326/base 2025-12-04T11:11:09.6259024Z * [new branch] gh/laithsakka/326/head -> origin/gh/laithsakka/326/head 2025-12-04T11:11:09.6259102Z * [new branch] gh/laithsakka/326/orig -> origin/gh/laithsakka/326/orig 2025-12-04T11:11:09.6259176Z * [new branch] gh/laithsakka/327/base -> origin/gh/laithsakka/327/base 2025-12-04T11:11:09.6259251Z * [new branch] gh/laithsakka/327/head -> origin/gh/laithsakka/327/head 2025-12-04T11:11:09.6259330Z * [new branch] gh/laithsakka/327/orig -> origin/gh/laithsakka/327/orig 2025-12-04T11:11:09.6259404Z * [new branch] gh/laithsakka/328/base -> origin/gh/laithsakka/328/base 2025-12-04T11:11:09.6259478Z * [new branch] gh/laithsakka/328/head -> origin/gh/laithsakka/328/head 2025-12-04T11:11:09.6259558Z * [new branch] gh/laithsakka/328/orig -> origin/gh/laithsakka/328/orig 2025-12-04T11:11:09.6259630Z * [new branch] gh/liangel/4/base -> origin/gh/liangel/4/base 2025-12-04T11:11:09.6259704Z * [new branch] gh/liangel/4/head -> origin/gh/liangel/4/head 2025-12-04T11:11:09.6259777Z * [new branch] gh/liangel/4/orig -> origin/gh/liangel/4/orig 2025-12-04T11:11:09.6259856Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-12-04T11:11:09.6259935Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-12-04T11:11:09.6260007Z * [new branch] gh/lw/4/base -> origin/gh/lw/4/base 2025-12-04T11:11:09.6260073Z * [new branch] gh/lw/4/head -> origin/gh/lw/4/head 2025-12-04T11:11:09.6260138Z * [new branch] gh/lw/4/orig -> origin/gh/lw/4/orig 2025-12-04T11:11:09.6260206Z * [new branch] gh/lw/5/base -> origin/gh/lw/5/base 2025-12-04T11:11:09.6260270Z * [new branch] gh/lw/5/head -> origin/gh/lw/5/head 2025-12-04T11:11:09.6260332Z * [new branch] gh/lw/5/orig -> origin/gh/lw/5/orig 2025-12-04T11:11:09.6260399Z * [new branch] gh/lw/6/base -> origin/gh/lw/6/base 2025-12-04T11:11:09.6260464Z * [new branch] gh/lw/6/head -> origin/gh/lw/6/head 2025-12-04T11:11:09.6260528Z * [new branch] gh/lw/6/orig -> origin/gh/lw/6/orig 2025-12-04T11:11:09.6260601Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-12-04T11:11:09.6260674Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-12-04T11:11:09.6260749Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-12-04T11:11:09.6260823Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-12-04T11:11:09.6260893Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-12-04T11:11:09.6260967Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-12-04T11:11:09.6261037Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-12-04T11:11:09.6261107Z * [new branch] gh/malfet/517/base -> origin/gh/malfet/517/base 2025-12-04T11:11:09.6261180Z * [new branch] gh/malfet/517/head -> origin/gh/malfet/517/head 2025-12-04T11:11:09.6261251Z * [new branch] gh/malfet/528/base -> origin/gh/malfet/528/base 2025-12-04T11:11:09.6261320Z * [new branch] gh/malfet/528/head -> origin/gh/malfet/528/head 2025-12-04T11:11:09.6261392Z * [new branch] gh/malfet/528/orig -> origin/gh/malfet/528/orig 
2025-12-04T11:11:09.6261870Z * [new branch] gh/malfet/537/base -> origin/gh/malfet/537/base 2025-12-04T11:11:09.6261943Z * [new branch] gh/malfet/537/head -> origin/gh/malfet/537/head 2025-12-04T11:11:09.6262017Z * [new branch] gh/malfet/537/orig -> origin/gh/malfet/537/orig 2025-12-04T11:11:09.6262107Z * [new branch] gh/malfet/546/base -> origin/gh/malfet/546/base 2025-12-04T11:11:09.6262177Z * [new branch] gh/malfet/546/head -> origin/gh/malfet/546/head 2025-12-04T11:11:09.6262249Z * [new branch] gh/malfet/546/orig -> origin/gh/malfet/546/orig 2025-12-04T11:11:09.6262319Z * [new branch] gh/malfet/565/base -> origin/gh/malfet/565/base 2025-12-04T11:11:09.6262387Z * [new branch] gh/malfet/565/head -> origin/gh/malfet/565/head 2025-12-04T11:11:09.6262460Z * [new branch] gh/malfet/565/orig -> origin/gh/malfet/565/orig 2025-12-04T11:11:09.6262531Z * [new branch] gh/malfet/575/base -> origin/gh/malfet/575/base 2025-12-04T11:11:09.6262600Z * [new branch] gh/malfet/575/head -> origin/gh/malfet/575/head 2025-12-04T11:11:09.6262674Z * [new branch] gh/malfet/575/orig -> origin/gh/malfet/575/orig 2025-12-04T11:11:09.6262745Z * [new branch] gh/malfet/580/base -> origin/gh/malfet/580/base 2025-12-04T11:11:09.6262818Z * [new branch] gh/malfet/580/head -> origin/gh/malfet/580/head 2025-12-04T11:11:09.6262887Z * [new branch] gh/malfet/580/orig -> origin/gh/malfet/580/orig 2025-12-04T11:11:09.6262955Z * [new branch] gh/malfet/581/base -> origin/gh/malfet/581/base 2025-12-04T11:11:09.6263027Z * [new branch] gh/malfet/581/head -> origin/gh/malfet/581/head 2025-12-04T11:11:09.6263097Z * [new branch] gh/malfet/581/orig -> origin/gh/malfet/581/orig 2025-12-04T11:11:09.6263167Z * [new branch] gh/malfet/583/base -> origin/gh/malfet/583/base 2025-12-04T11:11:09.6263239Z * [new branch] gh/malfet/583/head -> origin/gh/malfet/583/head 2025-12-04T11:11:09.6263308Z * [new branch] gh/malfet/583/orig -> origin/gh/malfet/583/orig 2025-12-04T11:11:09.6263379Z * [new branch] gh/malfet/586/base -> origin/gh/malfet/586/base 2025-12-04T11:11:09.6263452Z * [new branch] gh/malfet/586/head -> origin/gh/malfet/586/head 2025-12-04T11:11:09.6263520Z * [new branch] gh/malfet/586/orig -> origin/gh/malfet/586/orig 2025-12-04T11:11:09.6263589Z * [new branch] gh/malfet/587/base -> origin/gh/malfet/587/base 2025-12-04T11:11:09.6263662Z * [new branch] gh/malfet/587/head -> origin/gh/malfet/587/head 2025-12-04T11:11:09.6263730Z * [new branch] gh/malfet/587/orig -> origin/gh/malfet/587/orig 2025-12-04T11:11:09.6263799Z * [new branch] gh/malfet/588/base -> origin/gh/malfet/588/base 2025-12-04T11:11:09.6263874Z * [new branch] gh/malfet/588/head -> origin/gh/malfet/588/head 2025-12-04T11:11:09.6263944Z * [new branch] gh/malfet/588/orig -> origin/gh/malfet/588/orig 2025-12-04T11:11:09.6264015Z * [new branch] gh/malfet/589/base -> origin/gh/malfet/589/base 2025-12-04T11:11:09.6264088Z * [new branch] gh/malfet/589/head -> origin/gh/malfet/589/head 2025-12-04T11:11:09.6264158Z * [new branch] gh/malfet/589/orig -> origin/gh/malfet/589/orig 2025-12-04T11:11:09.6264228Z * [new branch] gh/malfet/590/base -> origin/gh/malfet/590/base 2025-12-04T11:11:09.6264301Z * [new branch] gh/malfet/590/head -> origin/gh/malfet/590/head 2025-12-04T11:11:09.6264370Z * [new branch] gh/malfet/590/orig -> origin/gh/malfet/590/orig 2025-12-04T11:11:09.6264440Z * [new branch] gh/malfet/591/base -> origin/gh/malfet/591/base 2025-12-04T11:11:09.6264538Z * [new branch] gh/malfet/591/head -> origin/gh/malfet/591/head 2025-12-04T11:11:09.6264608Z * [new branch] 
gh/malfet/591/orig -> origin/gh/malfet/591/orig 2025-12-04T11:11:09.6264703Z * [new branch] gh/malfet/592/base -> origin/gh/malfet/592/base 2025-12-04T11:11:09.6264773Z * [new branch] gh/malfet/592/head -> origin/gh/malfet/592/head 2025-12-04T11:11:09.6264842Z * [new branch] gh/malfet/592/orig -> origin/gh/malfet/592/orig 2025-12-04T11:11:09.6264914Z * [new branch] gh/malfet/593/base -> origin/gh/malfet/593/base 2025-12-04T11:11:09.6264983Z * [new branch] gh/malfet/593/head -> origin/gh/malfet/593/head 2025-12-04T11:11:09.6265051Z * [new branch] gh/malfet/593/orig -> origin/gh/malfet/593/orig 2025-12-04T11:11:09.6265126Z * [new branch] gh/malfet/594/base -> origin/gh/malfet/594/base 2025-12-04T11:11:09.6265196Z * [new branch] gh/malfet/594/head -> origin/gh/malfet/594/head 2025-12-04T11:11:09.6265265Z * [new branch] gh/malfet/594/orig -> origin/gh/malfet/594/orig 2025-12-04T11:11:09.6265338Z * [new branch] gh/malfet/595/base -> origin/gh/malfet/595/base 2025-12-04T11:11:09.6265407Z * [new branch] gh/malfet/595/head -> origin/gh/malfet/595/head 2025-12-04T11:11:09.6265476Z * [new branch] gh/malfet/595/orig -> origin/gh/malfet/595/orig 2025-12-04T11:11:09.6265549Z * [new branch] gh/malfet/596/base -> origin/gh/malfet/596/base 2025-12-04T11:11:09.6265620Z * [new branch] gh/malfet/596/head -> origin/gh/malfet/596/head 2025-12-04T11:11:09.6265688Z * [new branch] gh/malfet/596/orig -> origin/gh/malfet/596/orig 2025-12-04T11:11:09.6265760Z * [new branch] gh/malfet/597/base -> origin/gh/malfet/597/base 2025-12-04T11:11:09.6265831Z * [new branch] gh/malfet/597/head -> origin/gh/malfet/597/head 2025-12-04T11:11:09.6265900Z * [new branch] gh/malfet/597/orig -> origin/gh/malfet/597/orig 2025-12-04T11:11:09.6265974Z * [new branch] gh/malfet/598/base -> origin/gh/malfet/598/base 2025-12-04T11:11:09.6266044Z * [new branch] gh/malfet/598/head -> origin/gh/malfet/598/head 2025-12-04T11:11:09.6266113Z * [new branch] gh/malfet/598/orig -> origin/gh/malfet/598/orig 2025-12-04T11:11:09.6266185Z * [new branch] gh/malfet/599/base -> origin/gh/malfet/599/base 2025-12-04T11:11:09.6266253Z * [new branch] gh/malfet/599/head -> origin/gh/malfet/599/head 2025-12-04T11:11:09.6266326Z * [new branch] gh/malfet/599/orig -> origin/gh/malfet/599/orig 2025-12-04T11:11:09.6266396Z * [new branch] gh/malfet/600/base -> origin/gh/malfet/600/base 2025-12-04T11:11:09.6266467Z * [new branch] gh/malfet/600/head -> origin/gh/malfet/600/head 2025-12-04T11:11:09.6266539Z * [new branch] gh/malfet/600/orig -> origin/gh/malfet/600/orig 2025-12-04T11:11:09.6266610Z * [new branch] gh/malfet/601/base -> origin/gh/malfet/601/base 2025-12-04T11:11:09.6266680Z * [new branch] gh/malfet/601/head -> origin/gh/malfet/601/head 2025-12-04T11:11:09.6266753Z * [new branch] gh/malfet/601/orig -> origin/gh/malfet/601/orig 2025-12-04T11:11:09.6266822Z * [new branch] gh/malfet/602/base -> origin/gh/malfet/602/base 2025-12-04T11:11:09.6266892Z * [new branch] gh/malfet/602/head -> origin/gh/malfet/602/head 2025-12-04T11:11:09.6266968Z * [new branch] gh/malfet/602/orig -> origin/gh/malfet/602/orig 2025-12-04T11:11:09.6267037Z * [new branch] gh/malfet/603/base -> origin/gh/malfet/603/base 2025-12-04T11:11:09.6267128Z * [new branch] gh/malfet/603/head -> origin/gh/malfet/603/head 2025-12-04T11:11:09.6278363Z * [new branch] gh/malfet/603/orig -> origin/gh/malfet/603/orig 2025-12-04T11:11:09.6278508Z * [new branch] gh/malfet/604/base -> origin/gh/malfet/604/base 2025-12-04T11:11:09.6278584Z * [new branch] gh/malfet/604/head -> origin/gh/malfet/604/head 
2025-12-04T11:11:09.6278663Z * [new branch] gh/malfet/604/orig -> origin/gh/malfet/604/orig 2025-12-04T11:11:09.6278734Z * [new branch] gh/malfet/605/base -> origin/gh/malfet/605/base 2025-12-04T11:11:09.6278805Z * [new branch] gh/malfet/605/head -> origin/gh/malfet/605/head 2025-12-04T11:11:09.6278884Z * [new branch] gh/malfet/605/orig -> origin/gh/malfet/605/orig 2025-12-04T11:11:09.6278958Z * [new branch] gh/malfet/606/base -> origin/gh/malfet/606/base 2025-12-04T11:11:09.6279032Z * [new branch] gh/malfet/606/head -> origin/gh/malfet/606/head 2025-12-04T11:11:09.6279109Z * [new branch] gh/malfet/606/orig -> origin/gh/malfet/606/orig 2025-12-04T11:11:09.6279185Z * [new branch] gh/malfet/607/base -> origin/gh/malfet/607/base 2025-12-04T11:11:09.6279255Z * [new branch] gh/malfet/607/head -> origin/gh/malfet/607/head 2025-12-04T11:11:09.6279335Z * [new branch] gh/malfet/607/orig -> origin/gh/malfet/607/orig 2025-12-04T11:11:09.6279406Z * [new branch] gh/malfet/608/base -> origin/gh/malfet/608/base 2025-12-04T11:11:09.6279485Z * [new branch] gh/malfet/608/head -> origin/gh/malfet/608/head 2025-12-04T11:11:09.6279557Z * [new branch] gh/malfet/608/orig -> origin/gh/malfet/608/orig 2025-12-04T11:11:09.6279627Z * [new branch] gh/malfet/609/base -> origin/gh/malfet/609/base 2025-12-04T11:11:09.6279702Z * [new branch] gh/malfet/609/head -> origin/gh/malfet/609/head 2025-12-04T11:11:09.6279775Z * [new branch] gh/malfet/609/orig -> origin/gh/malfet/609/orig 2025-12-04T11:11:09.6279855Z * [new branch] gh/malfet/610/base -> origin/gh/malfet/610/base 2025-12-04T11:11:09.6279930Z * [new branch] gh/malfet/610/head -> origin/gh/malfet/610/head 2025-12-04T11:11:09.6280003Z * [new branch] gh/malfet/610/orig -> origin/gh/malfet/610/orig 2025-12-04T11:11:09.6280074Z * [new branch] gh/malfet/611/base -> origin/gh/malfet/611/base 2025-12-04T11:11:09.6280149Z * [new branch] gh/malfet/611/head -> origin/gh/malfet/611/head 2025-12-04T11:11:09.6280220Z * [new branch] gh/malfet/611/orig -> origin/gh/malfet/611/orig 2025-12-04T11:11:09.6280290Z * [new branch] gh/malfet/612/base -> origin/gh/malfet/612/base 2025-12-04T11:11:09.6280367Z * [new branch] gh/malfet/612/head -> origin/gh/malfet/612/head 2025-12-04T11:11:09.6280438Z * [new branch] gh/malfet/612/orig -> origin/gh/malfet/612/orig 2025-12-04T11:11:09.6280513Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-12-04T11:11:09.6280596Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-12-04T11:11:09.6280692Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-12-04T11:11:09.6280781Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-12-04T11:11:09.6280872Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-12-04T11:11:09.6280949Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-12-04T11:11:09.6281027Z * [new branch] gh/masnesral/1/base -> origin/gh/masnesral/1/base 2025-12-04T11:11:09.6281139Z * [new branch] gh/masnesral/1/head -> origin/gh/masnesral/1/head 2025-12-04T11:11:09.6281217Z * [new branch] gh/masnesral/1/orig -> origin/gh/masnesral/1/orig 2025-12-04T11:11:09.6281321Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-12-04T11:11:09.6281395Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-12-04T11:11:09.6281469Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-12-04T11:11:09.6281546Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 
2025-12-04T11:11:09.6281619Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-12-04T11:11:09.6281693Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-12-04T11:11:09.6281771Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-12-04T11:11:09.6281846Z * [new branch] gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-12-04T11:11:09.6281920Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-12-04T11:11:09.6282001Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-12-04T11:11:09.6282075Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-12-04T11:11:09.6282149Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-12-04T11:11:09.6282227Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-12-04T11:11:09.6282302Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-12-04T11:11:09.6282410Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-12-04T11:11:09.6282517Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-12-04T11:11:09.6282617Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-12-04T11:11:09.6282714Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-12-04T11:11:09.6282812Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-12-04T11:11:09.6282908Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-12-04T11:11:09.6283007Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-12-04T11:11:09.6283103Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-12-04T11:11:09.6283198Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-12-04T11:11:09.6283298Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-12-04T11:11:09.6283396Z * [new branch] gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-12-04T11:11:09.6283491Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-12-04T11:11:09.6283594Z * [new branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-12-04T11:11:09.6283689Z * [new branch] gh/mikaylagawarecki/341/base -> origin/gh/mikaylagawarecki/341/base 2025-12-04T11:11:09.6283785Z * [new branch] gh/mikaylagawarecki/341/head -> origin/gh/mikaylagawarecki/341/head 2025-12-04T11:11:09.6283886Z * [new branch] gh/mikaylagawarecki/341/orig -> origin/gh/mikaylagawarecki/341/orig 2025-12-04T11:11:09.6283982Z * [new branch] gh/mikaylagawarecki/342/base -> origin/gh/mikaylagawarecki/342/base 2025-12-04T11:11:09.6284078Z * [new branch] gh/mikaylagawarecki/342/head -> origin/gh/mikaylagawarecki/342/head 2025-12-04T11:11:09.6284199Z * [new branch] gh/mikaylagawarecki/342/orig -> origin/gh/mikaylagawarecki/342/orig 2025-12-04T11:11:09.6284296Z * [new branch] gh/mikaylagawarecki/345/base -> origin/gh/mikaylagawarecki/345/base 2025-12-04T11:11:09.6284419Z * [new branch] gh/mikaylagawarecki/345/head -> origin/gh/mikaylagawarecki/345/head 2025-12-04T11:11:09.6284517Z * [new branch] gh/mikaylagawarecki/345/orig -> origin/gh/mikaylagawarecki/345/orig 2025-12-04T11:11:09.6284613Z * [new branch] gh/mikaylagawarecki/346/base -> origin/gh/mikaylagawarecki/346/base 2025-12-04T11:11:09.6284715Z * [new 
branch] gh/mikaylagawarecki/346/head -> origin/gh/mikaylagawarecki/346/head 2025-12-04T11:11:09.6284811Z * [new branch] gh/mikaylagawarecki/346/orig -> origin/gh/mikaylagawarecki/346/orig 2025-12-04T11:11:09.6284907Z * [new branch] gh/mikaylagawarecki/347/base -> origin/gh/mikaylagawarecki/347/base 2025-12-04T11:11:09.6285009Z * [new branch] gh/mikaylagawarecki/347/head -> origin/gh/mikaylagawarecki/347/head 2025-12-04T11:11:09.6285106Z * [new branch] gh/mikaylagawarecki/347/orig -> origin/gh/mikaylagawarecki/347/orig 2025-12-04T11:11:09.6285203Z * [new branch] gh/mikaylagawarecki/350/base -> origin/gh/mikaylagawarecki/350/base 2025-12-04T11:11:09.6285302Z * [new branch] gh/mikaylagawarecki/350/head -> origin/gh/mikaylagawarecki/350/head 2025-12-04T11:11:09.6285398Z * [new branch] gh/mikaylagawarecki/350/orig -> origin/gh/mikaylagawarecki/350/orig 2025-12-04T11:11:09.6285494Z * [new branch] gh/mikaylagawarecki/351/base -> origin/gh/mikaylagawarecki/351/base 2025-12-04T11:11:09.6285598Z * [new branch] gh/mikaylagawarecki/351/head -> origin/gh/mikaylagawarecki/351/head 2025-12-04T11:11:09.6285694Z * [new branch] gh/mikaylagawarecki/351/orig -> origin/gh/mikaylagawarecki/351/orig 2025-12-04T11:11:09.6285792Z * [new branch] gh/mikaylagawarecki/352/base -> origin/gh/mikaylagawarecki/352/base 2025-12-04T11:11:09.6285892Z * [new branch] gh/mikaylagawarecki/352/head -> origin/gh/mikaylagawarecki/352/head 2025-12-04T11:11:09.6285986Z * [new branch] gh/mikaylagawarecki/352/orig -> origin/gh/mikaylagawarecki/352/orig 2025-12-04T11:11:09.6286085Z * [new branch] gh/mikaylagawarecki/353/base -> origin/gh/mikaylagawarecki/353/base 2025-12-04T11:11:09.6286182Z * [new branch] gh/mikaylagawarecki/353/head -> origin/gh/mikaylagawarecki/353/head 2025-12-04T11:11:09.6286275Z * [new branch] gh/mikaylagawarecki/353/orig -> origin/gh/mikaylagawarecki/353/orig 2025-12-04T11:11:09.6286373Z * [new branch] gh/mikaylagawarecki/354/base -> origin/gh/mikaylagawarecki/354/base 2025-12-04T11:11:09.6286469Z * [new branch] gh/mikaylagawarecki/354/head -> origin/gh/mikaylagawarecki/354/head 2025-12-04T11:11:09.6286566Z * [new branch] gh/mikaylagawarecki/354/orig -> origin/gh/mikaylagawarecki/354/orig 2025-12-04T11:11:09.6286667Z * [new branch] gh/mikaylagawarecki/356/base -> origin/gh/mikaylagawarecki/356/base 2025-12-04T11:11:09.6286765Z * [new branch] gh/mikaylagawarecki/356/head -> origin/gh/mikaylagawarecki/356/head 2025-12-04T11:11:09.6286858Z * [new branch] gh/mikaylagawarecki/356/orig -> origin/gh/mikaylagawarecki/356/orig 2025-12-04T11:11:09.6286960Z * [new branch] gh/mikaylagawarecki/357/base -> origin/gh/mikaylagawarecki/357/base 2025-12-04T11:11:09.6287057Z * [new branch] gh/mikaylagawarecki/357/head -> origin/gh/mikaylagawarecki/357/head 2025-12-04T11:11:09.6287154Z * [new branch] gh/mikaylagawarecki/357/orig -> origin/gh/mikaylagawarecki/357/orig 2025-12-04T11:11:09.6287254Z * [new branch] gh/mikaylagawarecki/359/base -> origin/gh/mikaylagawarecki/359/base 2025-12-04T11:11:09.6287370Z * [new branch] gh/mikaylagawarecki/359/head -> origin/gh/mikaylagawarecki/359/head 2025-12-04T11:11:09.6287466Z * [new branch] gh/mikaylagawarecki/359/orig -> origin/gh/mikaylagawarecki/359/orig 2025-12-04T11:11:09.6287584Z * [new branch] gh/mikaylagawarecki/360/base -> origin/gh/mikaylagawarecki/360/base 2025-12-04T11:11:09.6287680Z * [new branch] gh/mikaylagawarecki/360/head -> origin/gh/mikaylagawarecki/360/head 2025-12-04T11:11:09.6287781Z * [new branch] gh/mikaylagawarecki/360/orig -> origin/gh/mikaylagawarecki/360/orig 
2025-12-04T11:11:09.6287876Z * [new branch] gh/mikaylagawarecki/361/base -> origin/gh/mikaylagawarecki/361/base 2025-12-04T11:11:09.6287971Z * [new branch] gh/mikaylagawarecki/361/head -> origin/gh/mikaylagawarecki/361/head 2025-12-04T11:11:09.6288071Z * [new branch] gh/mikaylagawarecki/361/orig -> origin/gh/mikaylagawarecki/361/orig 2025-12-04T11:11:09.6288200Z * [new branch] gh/mikaylagawarecki/362/base -> origin/gh/mikaylagawarecki/362/base 2025-12-04T11:11:09.6288298Z * [new branch] gh/mikaylagawarecki/362/head -> origin/gh/mikaylagawarecki/362/head 2025-12-04T11:11:09.6288399Z * [new branch] gh/mikaylagawarecki/362/orig -> origin/gh/mikaylagawarecki/362/orig 2025-12-04T11:11:09.6288494Z * [new branch] gh/mikaylagawarecki/363/base -> origin/gh/mikaylagawarecki/363/base 2025-12-04T11:11:09.6288592Z * [new branch] gh/mikaylagawarecki/363/head -> origin/gh/mikaylagawarecki/363/head 2025-12-04T11:11:09.6288690Z * [new branch] gh/mikaylagawarecki/363/orig -> origin/gh/mikaylagawarecki/363/orig 2025-12-04T11:11:09.6288787Z * [new branch] gh/mikaylagawarecki/364/base -> origin/gh/mikaylagawarecki/364/base 2025-12-04T11:11:09.6288883Z * [new branch] gh/mikaylagawarecki/364/head -> origin/gh/mikaylagawarecki/364/head 2025-12-04T11:11:09.6288982Z * [new branch] gh/mikaylagawarecki/364/orig -> origin/gh/mikaylagawarecki/364/orig 2025-12-04T11:11:09.6289078Z * [new branch] gh/mikaylagawarecki/365/base -> origin/gh/mikaylagawarecki/365/base 2025-12-04T11:11:09.6289181Z * [new branch] gh/mikaylagawarecki/365/head -> origin/gh/mikaylagawarecki/365/head 2025-12-04T11:11:09.6289278Z * [new branch] gh/mikaylagawarecki/365/orig -> origin/gh/mikaylagawarecki/365/orig 2025-12-04T11:11:09.6289373Z * [new branch] gh/mikaylagawarecki/366/base -> origin/gh/mikaylagawarecki/366/base 2025-12-04T11:11:09.6289476Z * [new branch] gh/mikaylagawarecki/366/head -> origin/gh/mikaylagawarecki/366/head 2025-12-04T11:11:09.6289573Z * [new branch] gh/mikaylagawarecki/366/orig -> origin/gh/mikaylagawarecki/366/orig 2025-12-04T11:11:09.6289669Z * [new branch] gh/mikaylagawarecki/367/base -> origin/gh/mikaylagawarecki/367/base 2025-12-04T11:11:09.6289774Z * [new branch] gh/mikaylagawarecki/367/head -> origin/gh/mikaylagawarecki/367/head 2025-12-04T11:11:09.6289870Z * [new branch] gh/mikaylagawarecki/367/orig -> origin/gh/mikaylagawarecki/367/orig 2025-12-04T11:11:09.6289966Z * [new branch] gh/mikaylagawarecki/368/base -> origin/gh/mikaylagawarecki/368/base 2025-12-04T11:11:09.6290068Z * [new branch] gh/mikaylagawarecki/368/head -> origin/gh/mikaylagawarecki/368/head 2025-12-04T11:11:09.6290164Z * [new branch] gh/mikaylagawarecki/368/orig -> origin/gh/mikaylagawarecki/368/orig 2025-12-04T11:11:09.6290259Z * [new branch] gh/mikaylagawarecki/369/base -> origin/gh/mikaylagawarecki/369/base 2025-12-04T11:11:09.6290362Z * [new branch] gh/mikaylagawarecki/369/head -> origin/gh/mikaylagawarecki/369/head 2025-12-04T11:11:09.6290460Z * [new branch] gh/mikaylagawarecki/369/orig -> origin/gh/mikaylagawarecki/369/orig 2025-12-04T11:11:09.6290598Z * [new branch] gh/mikaylagawarecki/370/base -> origin/gh/mikaylagawarecki/370/base 2025-12-04T11:11:09.6290696Z * [new branch] gh/mikaylagawarecki/370/head -> origin/gh/mikaylagawarecki/370/head 2025-12-04T11:11:09.6290819Z * [new branch] gh/mikaylagawarecki/370/orig -> origin/gh/mikaylagawarecki/370/orig 2025-12-04T11:11:09.6290920Z * [new branch] gh/mikaylagawarecki/371/base -> origin/gh/mikaylagawarecki/371/base 2025-12-04T11:11:09.6291016Z * [new branch] gh/mikaylagawarecki/371/head -> 
origin/gh/mikaylagawarecki/371/head 2025-12-04T11:11:09.6291111Z * [new branch] gh/mikaylagawarecki/371/orig -> origin/gh/mikaylagawarecki/371/orig 2025-12-04T11:11:09.6291213Z * [new branch] gh/mikaylagawarecki/372/base -> origin/gh/mikaylagawarecki/372/base 2025-12-04T11:11:09.6291309Z * [new branch] gh/mikaylagawarecki/372/head -> origin/gh/mikaylagawarecki/372/head 2025-12-04T11:11:09.6291406Z * [new branch] gh/mikaylagawarecki/372/orig -> origin/gh/mikaylagawarecki/372/orig 2025-12-04T11:11:09.6291507Z * [new branch] gh/mikaylagawarecki/373/base -> origin/gh/mikaylagawarecki/373/base 2025-12-04T11:11:09.6291604Z * [new branch] gh/mikaylagawarecki/373/head -> origin/gh/mikaylagawarecki/373/head 2025-12-04T11:11:09.6291700Z * [new branch] gh/mikaylagawarecki/373/orig -> origin/gh/mikaylagawarecki/373/orig 2025-12-04T11:11:09.6291800Z * [new branch] gh/mikaylagawarecki/374/base -> origin/gh/mikaylagawarecki/374/base 2025-12-04T11:11:09.6291895Z * [new branch] gh/mikaylagawarecki/374/head -> origin/gh/mikaylagawarecki/374/head 2025-12-04T11:11:09.6291990Z * [new branch] gh/mikaylagawarecki/374/orig -> origin/gh/mikaylagawarecki/374/orig 2025-12-04T11:11:09.6292088Z * [new branch] gh/mikaylagawarecki/375/base -> origin/gh/mikaylagawarecki/375/base 2025-12-04T11:11:09.6292188Z * [new branch] gh/mikaylagawarecki/375/head -> origin/gh/mikaylagawarecki/375/head 2025-12-04T11:11:09.6292285Z * [new branch] gh/mikaylagawarecki/375/orig -> origin/gh/mikaylagawarecki/375/orig 2025-12-04T11:11:09.6292380Z * [new branch] gh/mikaylagawarecki/376/base -> origin/gh/mikaylagawarecki/376/base 2025-12-04T11:11:09.6292474Z * [new branch] gh/mikaylagawarecki/376/head -> origin/gh/mikaylagawarecki/376/head 2025-12-04T11:11:09.6292574Z * [new branch] gh/mikaylagawarecki/376/orig -> origin/gh/mikaylagawarecki/376/orig 2025-12-04T11:11:09.6292670Z * [new branch] gh/mikaylagawarecki/377/base -> origin/gh/mikaylagawarecki/377/base 2025-12-04T11:11:09.6292766Z * [new branch] gh/mikaylagawarecki/377/head -> origin/gh/mikaylagawarecki/377/head 2025-12-04T11:11:09.6292870Z * [new branch] gh/mikaylagawarecki/377/orig -> origin/gh/mikaylagawarecki/377/orig 2025-12-04T11:11:09.6292968Z * [new branch] gh/mikaylagawarecki/378/base -> origin/gh/mikaylagawarecki/378/base 2025-12-04T11:11:09.6293064Z * [new branch] gh/mikaylagawarecki/378/head -> origin/gh/mikaylagawarecki/378/head 2025-12-04T11:11:09.6293167Z * [new branch] gh/mikaylagawarecki/378/orig -> origin/gh/mikaylagawarecki/378/orig 2025-12-04T11:11:09.6293263Z * [new branch] gh/mikaylagawarecki/379/base -> origin/gh/mikaylagawarecki/379/base 2025-12-04T11:11:09.6293360Z * [new branch] gh/mikaylagawarecki/379/head -> origin/gh/mikaylagawarecki/379/head 2025-12-04T11:11:09.6293462Z * [new branch] gh/mikaylagawarecki/379/orig -> origin/gh/mikaylagawarecki/379/orig 2025-12-04T11:11:09.6293560Z * [new branch] gh/mikaylagawarecki/380/base -> origin/gh/mikaylagawarecki/380/base 2025-12-04T11:11:09.6293662Z * [new branch] gh/mikaylagawarecki/380/head -> origin/gh/mikaylagawarecki/380/head 2025-12-04T11:11:09.6293778Z * [new branch] gh/mikaylagawarecki/380/orig -> origin/gh/mikaylagawarecki/380/orig 2025-12-04T11:11:09.6293873Z * [new branch] gh/mikaylagawarecki/381/base -> origin/gh/mikaylagawarecki/381/base 2025-12-04T11:11:09.6293995Z * [new branch] gh/mikaylagawarecki/381/head -> origin/gh/mikaylagawarecki/381/head 2025-12-04T11:11:09.6294093Z * [new branch] gh/mikaylagawarecki/381/orig -> origin/gh/mikaylagawarecki/381/orig 2025-12-04T11:11:09.6294190Z * [new branch] 
gh/mikaylagawarecki/382/base -> origin/gh/mikaylagawarecki/382/base 2025-12-04T11:11:09.6294291Z * [new branch] gh/mikaylagawarecki/382/head -> origin/gh/mikaylagawarecki/382/head 2025-12-04T11:11:09.6294388Z * [new branch] gh/mikaylagawarecki/382/orig -> origin/gh/mikaylagawarecki/382/orig 2025-12-04T11:11:09.6294484Z * [new branch] gh/mikaylagawarecki/383/base -> origin/gh/mikaylagawarecki/383/base 2025-12-04T11:11:09.6294586Z * [new branch] gh/mikaylagawarecki/383/head -> origin/gh/mikaylagawarecki/383/head 2025-12-04T11:11:09.6294680Z * [new branch] gh/mikaylagawarecki/383/orig -> origin/gh/mikaylagawarecki/383/orig 2025-12-04T11:11:09.6294777Z * [new branch] gh/mikaylagawarecki/384/base -> origin/gh/mikaylagawarecki/384/base 2025-12-04T11:11:09.6294877Z * [new branch] gh/mikaylagawarecki/384/head -> origin/gh/mikaylagawarecki/384/head 2025-12-04T11:11:09.6294972Z * [new branch] gh/mikaylagawarecki/384/orig -> origin/gh/mikaylagawarecki/384/orig 2025-12-04T11:11:09.6295068Z * [new branch] gh/mikaylagawarecki/385/base -> origin/gh/mikaylagawarecki/385/base 2025-12-04T11:11:09.6295172Z * [new branch] gh/mikaylagawarecki/385/head -> origin/gh/mikaylagawarecki/385/head 2025-12-04T11:11:09.6295267Z * [new branch] gh/mikaylagawarecki/385/orig -> origin/gh/mikaylagawarecki/385/orig 2025-12-04T11:11:09.6295369Z * [new branch] gh/mikaylagawarecki/386/base -> origin/gh/mikaylagawarecki/386/base 2025-12-04T11:11:09.6295464Z * [new branch] gh/mikaylagawarecki/386/head -> origin/gh/mikaylagawarecki/386/head 2025-12-04T11:11:09.6295558Z * [new branch] gh/mikaylagawarecki/386/orig -> origin/gh/mikaylagawarecki/386/orig 2025-12-04T11:11:09.6295653Z * [new branch] gh/mikaylagawarecki/387/base -> origin/gh/mikaylagawarecki/387/base 2025-12-04T11:11:09.6295753Z * [new branch] gh/mikaylagawarecki/387/head -> origin/gh/mikaylagawarecki/387/head 2025-12-04T11:11:09.6295849Z * [new branch] gh/mikaylagawarecki/387/orig -> origin/gh/mikaylagawarecki/387/orig 2025-12-04T11:11:09.6295943Z * [new branch] gh/mikaylagawarecki/388/base -> origin/gh/mikaylagawarecki/388/base 2025-12-04T11:11:09.6296041Z * [new branch] gh/mikaylagawarecki/388/head -> origin/gh/mikaylagawarecki/388/head 2025-12-04T11:11:09.6296137Z * [new branch] gh/mikaylagawarecki/388/orig -> origin/gh/mikaylagawarecki/388/orig 2025-12-04T11:11:09.6296233Z * [new branch] gh/mikaylagawarecki/389/base -> origin/gh/mikaylagawarecki/389/base 2025-12-04T11:11:09.6296331Z * [new branch] gh/mikaylagawarecki/389/head -> origin/gh/mikaylagawarecki/389/head 2025-12-04T11:11:09.6296427Z * [new branch] gh/mikaylagawarecki/389/orig -> origin/gh/mikaylagawarecki/389/orig 2025-12-04T11:11:09.6296522Z * [new branch] gh/mikaylagawarecki/390/base -> origin/gh/mikaylagawarecki/390/base 2025-12-04T11:11:09.6296621Z * [new branch] gh/mikaylagawarecki/390/head -> origin/gh/mikaylagawarecki/390/head 2025-12-04T11:11:09.6296715Z * [new branch] gh/mikaylagawarecki/390/orig -> origin/gh/mikaylagawarecki/390/orig 2025-12-04T11:11:09.6296811Z * [new branch] gh/mikaylagawarecki/391/base -> origin/gh/mikaylagawarecki/391/base 2025-12-04T11:11:09.6296928Z * [new branch] gh/mikaylagawarecki/391/head -> origin/gh/mikaylagawarecki/391/head 2025-12-04T11:11:09.6297023Z * [new branch] gh/mikaylagawarecki/391/orig -> origin/gh/mikaylagawarecki/391/orig 2025-12-04T11:11:09.6297151Z * [new branch] gh/mikaylagawarecki/392/base -> origin/gh/mikaylagawarecki/392/base 2025-12-04T11:11:09.6297247Z * [new branch] gh/mikaylagawarecki/392/head -> origin/gh/mikaylagawarecki/392/head 
2025-12-04T11:11:09.6297341Z * [new branch] gh/mikaylagawarecki/392/orig -> origin/gh/mikaylagawarecki/392/orig 2025-12-04T11:11:09.6297419Z * [new branch] gh/mlazos/41/base -> origin/gh/mlazos/41/base 2025-12-04T11:11:09.6297491Z * [new branch] gh/mlazos/41/head -> origin/gh/mlazos/41/head 2025-12-04T11:11:09.6297562Z * [new branch] gh/mlazos/41/orig -> origin/gh/mlazos/41/orig 2025-12-04T11:11:09.6297638Z * [new branch] gh/mlazos/42/base -> origin/gh/mlazos/42/base 2025-12-04T11:11:09.6297709Z * [new branch] gh/mlazos/42/head -> origin/gh/mlazos/42/head 2025-12-04T11:11:09.6297779Z * [new branch] gh/mlazos/42/orig -> origin/gh/mlazos/42/orig 2025-12-04T11:11:09.6297853Z * [new branch] gh/mlazos/43/base -> origin/gh/mlazos/43/base 2025-12-04T11:11:09.6297922Z * [new branch] gh/mlazos/43/head -> origin/gh/mlazos/43/head 2025-12-04T11:11:09.6297991Z * [new branch] gh/mlazos/43/orig -> origin/gh/mlazos/43/orig 2025-12-04T11:11:09.6298065Z * [new branch] gh/mlazos/44/base -> origin/gh/mlazos/44/base 2025-12-04T11:11:09.6298133Z * [new branch] gh/mlazos/44/head -> origin/gh/mlazos/44/head 2025-12-04T11:11:09.6298232Z * [new branch] gh/mlazos/44/orig -> origin/gh/mlazos/44/orig 2025-12-04T11:11:09.6298309Z * [new branch] gh/mlazos/47/base -> origin/gh/mlazos/47/base 2025-12-04T11:11:09.6298380Z * [new branch] gh/mlazos/47/head -> origin/gh/mlazos/47/head 2025-12-04T11:11:09.6298451Z * [new branch] gh/mlazos/47/orig -> origin/gh/mlazos/47/orig 2025-12-04T11:11:09.6298528Z * [new branch] gh/mlazos/48/base -> origin/gh/mlazos/48/base 2025-12-04T11:11:09.6298599Z * [new branch] gh/mlazos/48/head -> origin/gh/mlazos/48/head 2025-12-04T11:11:09.6298672Z * [new branch] gh/mlazos/48/orig -> origin/gh/mlazos/48/orig 2025-12-04T11:11:09.6298742Z * [new branch] gh/mlazos/49/base -> origin/gh/mlazos/49/base 2025-12-04T11:11:09.6298812Z * [new branch] gh/mlazos/49/head -> origin/gh/mlazos/49/head 2025-12-04T11:11:09.6298884Z * [new branch] gh/mlazos/49/orig -> origin/gh/mlazos/49/orig 2025-12-04T11:11:09.6298956Z * [new branch] gh/mlazos/50/base -> origin/gh/mlazos/50/base 2025-12-04T11:11:09.6299026Z * [new branch] gh/mlazos/50/head -> origin/gh/mlazos/50/head 2025-12-04T11:11:09.6299100Z * [new branch] gh/mlazos/50/orig -> origin/gh/mlazos/50/orig 2025-12-04T11:11:09.6299172Z * [new branch] gh/mlazos/51/base -> origin/gh/mlazos/51/base 2025-12-04T11:11:09.6299243Z * [new branch] gh/mlazos/51/head -> origin/gh/mlazos/51/head 2025-12-04T11:11:09.6299317Z * [new branch] gh/mlazos/51/orig -> origin/gh/mlazos/51/orig 2025-12-04T11:11:09.6299386Z * [new branch] gh/mlazos/52/base -> origin/gh/mlazos/52/base 2025-12-04T11:11:09.6299455Z * [new branch] gh/mlazos/52/head -> origin/gh/mlazos/52/head 2025-12-04T11:11:09.6299526Z * [new branch] gh/mlazos/52/orig -> origin/gh/mlazos/52/orig 2025-12-04T11:11:09.6299628Z * [new branch] gh/mlazos/53/base -> origin/gh/mlazos/53/base 2025-12-04T11:11:09.6299699Z * [new branch] gh/mlazos/53/head -> origin/gh/mlazos/53/head 2025-12-04T11:11:09.6299773Z * [new branch] gh/mlazos/53/orig -> origin/gh/mlazos/53/orig 2025-12-04T11:11:09.6299872Z * [new branch] gh/mlazos/54/base -> origin/gh/mlazos/54/base 2025-12-04T11:11:09.6299943Z * [new branch] gh/mlazos/54/head -> origin/gh/mlazos/54/head 2025-12-04T11:11:09.6300018Z * [new branch] gh/mlazos/54/orig -> origin/gh/mlazos/54/orig 2025-12-04T11:11:09.6300087Z * [new branch] gh/mlazos/55/base -> origin/gh/mlazos/55/base 2025-12-04T11:11:09.6300158Z * [new branch] gh/mlazos/55/head -> origin/gh/mlazos/55/head 
2025-12-04T11:11:09.6300231Z * [new branch] gh/mlazos/55/orig -> origin/gh/mlazos/55/orig 2025-12-04T11:11:09.6300299Z * [new branch] gh/mlazos/56/base -> origin/gh/mlazos/56/base 2025-12-04T11:11:09.6300376Z * [new branch] gh/mlazos/56/head -> origin/gh/mlazos/56/head 2025-12-04T11:11:09.6300445Z * [new branch] gh/mlazos/56/orig -> origin/gh/mlazos/56/orig 2025-12-04T11:11:09.6300555Z * [new branch] gh/mlazos/57/base -> origin/gh/mlazos/57/base 2025-12-04T11:11:09.6300666Z * [new branch] gh/mlazos/57/head -> origin/gh/mlazos/57/head 2025-12-04T11:11:09.6300739Z * [new branch] gh/mlazos/57/orig -> origin/gh/mlazos/57/orig 2025-12-04T11:11:09.6300809Z * [new branch] gh/mlazos/58/base -> origin/gh/mlazos/58/base 2025-12-04T11:11:09.6300884Z * [new branch] gh/mlazos/58/head -> origin/gh/mlazos/58/head 2025-12-04T11:11:09.6300954Z * [new branch] gh/mlazos/58/orig -> origin/gh/mlazos/58/orig 2025-12-04T11:11:09.6301024Z * [new branch] gh/mlazos/59/base -> origin/gh/mlazos/59/base 2025-12-04T11:11:09.6301102Z * [new branch] gh/mlazos/59/head -> origin/gh/mlazos/59/head 2025-12-04T11:11:09.6301171Z * [new branch] gh/mlazos/59/orig -> origin/gh/mlazos/59/orig 2025-12-04T11:11:09.6301244Z * [new branch] gh/mlazos/60/base -> origin/gh/mlazos/60/base 2025-12-04T11:11:09.6301316Z * [new branch] gh/mlazos/60/head -> origin/gh/mlazos/60/head 2025-12-04T11:11:09.6301386Z * [new branch] gh/mlazos/60/orig -> origin/gh/mlazos/60/orig 2025-12-04T11:11:09.6301456Z * [new branch] gh/mlazos/61/base -> origin/gh/mlazos/61/base 2025-12-04T11:11:09.6301532Z * [new branch] gh/mlazos/61/head -> origin/gh/mlazos/61/head 2025-12-04T11:11:09.6301602Z * [new branch] gh/mlazos/61/orig -> origin/gh/mlazos/61/orig 2025-12-04T11:11:09.6301672Z * [new branch] gh/mlazos/62/base -> origin/gh/mlazos/62/base 2025-12-04T11:11:09.6301750Z * [new branch] gh/mlazos/62/head -> origin/gh/mlazos/62/head 2025-12-04T11:11:09.6301820Z * [new branch] gh/mlazos/62/orig -> origin/gh/mlazos/62/orig 2025-12-04T11:11:09.6301891Z * [new branch] gh/mlazos/63/base -> origin/gh/mlazos/63/base 2025-12-04T11:11:09.6301965Z * [new branch] gh/mlazos/63/head -> origin/gh/mlazos/63/head 2025-12-04T11:11:09.6302034Z * [new branch] gh/mlazos/63/orig -> origin/gh/mlazos/63/orig 2025-12-04T11:11:09.6302107Z * [new branch] gh/mlazos/64/base -> origin/gh/mlazos/64/base 2025-12-04T11:11:09.6302176Z * [new branch] gh/mlazos/64/head -> origin/gh/mlazos/64/head 2025-12-04T11:11:09.6302246Z * [new branch] gh/mlazos/64/orig -> origin/gh/mlazos/64/orig 2025-12-04T11:11:09.6302321Z * [new branch] gh/mlazos/65/base -> origin/gh/mlazos/65/base 2025-12-04T11:11:09.6302416Z * [new branch] gh/mlazos/65/head -> origin/gh/mlazos/65/head 2025-12-04T11:11:09.6302487Z * [new branch] gh/mlazos/65/orig -> origin/gh/mlazos/65/orig 2025-12-04T11:11:09.6302561Z * [new branch] gh/mlazos/66/base -> origin/gh/mlazos/66/base 2025-12-04T11:11:09.6302653Z * [new branch] gh/mlazos/66/head -> origin/gh/mlazos/66/head 2025-12-04T11:11:09.6302722Z * [new branch] gh/mlazos/66/orig -> origin/gh/mlazos/66/orig 2025-12-04T11:11:09.6302795Z * [new branch] gh/mlazos/67/base -> origin/gh/mlazos/67/base 2025-12-04T11:11:09.6302865Z * [new branch] gh/mlazos/67/head -> origin/gh/mlazos/67/head 2025-12-04T11:11:09.6302934Z * [new branch] gh/mlazos/67/orig -> origin/gh/mlazos/67/orig 2025-12-04T11:11:09.6303008Z * [new branch] gh/mlazos/68/base -> origin/gh/mlazos/68/base 2025-12-04T11:11:09.6303077Z * [new branch] gh/mlazos/68/head -> origin/gh/mlazos/68/head 2025-12-04T11:11:09.6303146Z * [new branch] 
gh/mlazos/68/orig -> origin/gh/mlazos/68/orig 2025-12-04T11:11:09.6303221Z * [new branch] gh/mlazos/69/base -> origin/gh/mlazos/69/base 2025-12-04T11:11:09.6303291Z * [new branch] gh/mlazos/69/head -> origin/gh/mlazos/69/head 2025-12-04T11:11:09.6303361Z * [new branch] gh/mlazos/69/orig -> origin/gh/mlazos/69/orig 2025-12-04T11:11:09.6303435Z * [new branch] gh/mlazos/70/base -> origin/gh/mlazos/70/base 2025-12-04T11:11:09.6303505Z * [new branch] gh/mlazos/70/head -> origin/gh/mlazos/70/head 2025-12-04T11:11:09.6303573Z * [new branch] gh/mlazos/70/orig -> origin/gh/mlazos/70/orig 2025-12-04T11:11:09.6303646Z * [new branch] gh/mlazos/71/base -> origin/gh/mlazos/71/base 2025-12-04T11:11:09.6303716Z * [new branch] gh/mlazos/71/head -> origin/gh/mlazos/71/head 2025-12-04T11:11:09.6303786Z * [new branch] gh/mlazos/71/orig -> origin/gh/mlazos/71/orig 2025-12-04T11:11:09.6303860Z * [new branch] gh/mlazos/72/base -> origin/gh/mlazos/72/base 2025-12-04T11:11:09.6303930Z * [new branch] gh/mlazos/72/head -> origin/gh/mlazos/72/head 2025-12-04T11:11:09.6304002Z * [new branch] gh/mlazos/72/orig -> origin/gh/mlazos/72/orig 2025-12-04T11:11:09.6304071Z * [new branch] gh/mlazos/73/base -> origin/gh/mlazos/73/base 2025-12-04T11:11:09.6304141Z * [new branch] gh/mlazos/73/head -> origin/gh/mlazos/73/head 2025-12-04T11:11:09.6304212Z * [new branch] gh/mlazos/73/orig -> origin/gh/mlazos/73/orig 2025-12-04T11:11:09.6304284Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-12-04T11:11:09.6304356Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-12-04T11:11:09.6304437Z * [new branch] gh/muchulee8/73/base -> origin/gh/muchulee8/73/base 2025-12-04T11:11:09.6304514Z * [new branch] gh/muchulee8/73/head -> origin/gh/muchulee8/73/head 2025-12-04T11:11:09.6304591Z * [new branch] gh/muchulee8/73/orig -> origin/gh/muchulee8/73/orig 2025-12-04T11:11:09.6304684Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-12-04T11:11:09.6304773Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-12-04T11:11:09.6304857Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-12-04T11:11:09.6304945Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-12-04T11:11:09.6305027Z * [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-12-04T11:11:09.6305130Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-12-04T11:11:09.6305218Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-12-04T11:11:09.6305301Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-12-04T11:11:09.6305413Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 2025-12-04T11:11:09.6305500Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-12-04T11:11:09.6305585Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-12-04T11:11:09.6305675Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-12-04T11:11:09.6305759Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-12-04T11:11:09.6305842Z * [new branch] gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-12-04T11:11:09.6305929Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-12-04T11:11:09.6306011Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 
2025-12-04T11:11:09.6306096Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-12-04T11:11:09.6306180Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-12-04T11:11:09.6306262Z * [new branch] gh/naveenthangudu/7/base -> origin/gh/naveenthangudu/7/base 2025-12-04T11:11:09.6306344Z * [new branch] gh/naveenthangudu/7/head -> origin/gh/naveenthangudu/7/head 2025-12-04T11:11:09.6306429Z * [new branch] gh/naveenthangudu/7/orig -> origin/gh/naveenthangudu/7/orig 2025-12-04T11:11:09.6306511Z * [new branch] gh/naveenthangudu/8/base -> origin/gh/naveenthangudu/8/base 2025-12-04T11:11:09.6306595Z * [new branch] gh/naveenthangudu/8/head -> origin/gh/naveenthangudu/8/head 2025-12-04T11:11:09.6306681Z * [new branch] gh/naveenthangudu/8/orig -> origin/gh/naveenthangudu/8/orig 2025-12-04T11:11:09.6306763Z * [new branch] gh/naveenthangudu/9/base -> origin/gh/naveenthangudu/9/base 2025-12-04T11:11:09.6306847Z * [new branch] gh/naveenthangudu/9/head -> origin/gh/naveenthangudu/9/head 2025-12-04T11:11:09.6306933Z * [new branch] gh/naveenthangudu/9/orig -> origin/gh/naveenthangudu/9/orig 2025-12-04T11:11:09.6307009Z * [new branch] gh/nikitaved/1/base -> origin/gh/nikitaved/1/base 2025-12-04T11:11:09.6307086Z * [new branch] gh/nikitaved/1/head -> origin/gh/nikitaved/1/head 2025-12-04T11:11:09.6307166Z * [new branch] gh/nikitaved/1/orig -> origin/gh/nikitaved/1/orig 2025-12-04T11:11:09.6307245Z * [new branch] gh/nikitaved/10/base -> origin/gh/nikitaved/10/base 2025-12-04T11:11:09.6307328Z * [new branch] gh/nikitaved/10/head -> origin/gh/nikitaved/10/head 2025-12-04T11:11:09.6307405Z * [new branch] gh/nikitaved/10/orig -> origin/gh/nikitaved/10/orig 2025-12-04T11:11:09.6307482Z * [new branch] gh/nikitaved/11/base -> origin/gh/nikitaved/11/base 2025-12-04T11:11:09.6307561Z * [new branch] gh/nikitaved/11/head -> origin/gh/nikitaved/11/head 2025-12-04T11:11:09.6307635Z * [new branch] gh/nikitaved/11/orig -> origin/gh/nikitaved/11/orig 2025-12-04T11:11:09.6307709Z * [new branch] gh/nikitaved/12/base -> origin/gh/nikitaved/12/base 2025-12-04T11:11:09.6307787Z * [new branch] gh/nikitaved/12/head -> origin/gh/nikitaved/12/head 2025-12-04T11:11:09.6307861Z * [new branch] gh/nikitaved/12/orig -> origin/gh/nikitaved/12/orig 2025-12-04T11:11:09.6307935Z * [new branch] gh/nikitaved/13/base -> origin/gh/nikitaved/13/base 2025-12-04T11:11:09.6308042Z * [new branch] gh/nikitaved/13/head -> origin/gh/nikitaved/13/head 2025-12-04T11:11:09.6308117Z * [new branch] gh/nikitaved/13/orig -> origin/gh/nikitaved/13/orig 2025-12-04T11:11:09.6308262Z * [new branch] gh/nikitaved/14/base -> origin/gh/nikitaved/14/base 2025-12-04T11:11:09.6308341Z * [new branch] gh/nikitaved/14/head -> origin/gh/nikitaved/14/head 2025-12-04T11:11:09.6308415Z * [new branch] gh/nikitaved/14/orig -> origin/gh/nikitaved/14/orig 2025-12-04T11:11:09.6308489Z * [new branch] gh/nikitaved/15/base -> origin/gh/nikitaved/15/base 2025-12-04T11:11:09.6308567Z * [new branch] gh/nikitaved/15/head -> origin/gh/nikitaved/15/head 2025-12-04T11:11:09.6308641Z * [new branch] gh/nikitaved/15/orig -> origin/gh/nikitaved/15/orig 2025-12-04T11:11:09.6308716Z * [new branch] gh/nikitaved/16/base -> origin/gh/nikitaved/16/base 2025-12-04T11:11:09.6308798Z * [new branch] gh/nikitaved/16/head -> origin/gh/nikitaved/16/head 2025-12-04T11:11:09.6308870Z * [new branch] gh/nikitaved/16/orig -> origin/gh/nikitaved/16/orig 2025-12-04T11:11:09.6308949Z * [new branch] gh/nikitaved/2/base -> origin/gh/nikitaved/2/base 
2025-12-04T11:11:09.6309024Z * [new branch] gh/nikitaved/2/head -> origin/gh/nikitaved/2/head 2025-12-04T11:11:09.6309099Z * [new branch] gh/nikitaved/2/orig -> origin/gh/nikitaved/2/orig 2025-12-04T11:11:09.6309176Z * [new branch] gh/nikitaved/4/base -> origin/gh/nikitaved/4/base 2025-12-04T11:11:09.6309249Z * [new branch] gh/nikitaved/4/head -> origin/gh/nikitaved/4/head 2025-12-04T11:11:09.6309322Z * [new branch] gh/nikitaved/4/orig -> origin/gh/nikitaved/4/orig 2025-12-04T11:11:09.6309397Z * [new branch] gh/nikitaved/5/base -> origin/gh/nikitaved/5/base 2025-12-04T11:11:09.6309471Z * [new branch] gh/nikitaved/5/head -> origin/gh/nikitaved/5/head 2025-12-04T11:11:09.6309550Z * [new branch] gh/nikitaved/5/orig -> origin/gh/nikitaved/5/orig 2025-12-04T11:11:09.6309627Z * [new branch] gh/nikitaved/6/base -> origin/gh/nikitaved/6/base 2025-12-04T11:11:09.6309701Z * [new branch] gh/nikitaved/6/head -> origin/gh/nikitaved/6/head 2025-12-04T11:11:09.6309780Z * [new branch] gh/nikitaved/6/orig -> origin/gh/nikitaved/6/orig 2025-12-04T11:11:09.6309853Z * [new branch] gh/nikitaved/8/base -> origin/gh/nikitaved/8/base 2025-12-04T11:11:09.6309927Z * [new branch] gh/nikitaved/8/head -> origin/gh/nikitaved/8/head 2025-12-04T11:11:09.6310004Z * [new branch] gh/nikitaved/8/orig -> origin/gh/nikitaved/8/orig 2025-12-04T11:11:09.6310077Z * [new branch] gh/nikitaved/9/base -> origin/gh/nikitaved/9/base 2025-12-04T11:11:09.6310153Z * [new branch] gh/nikitaved/9/head -> origin/gh/nikitaved/9/head 2025-12-04T11:11:09.6310232Z * [new branch] gh/nikitaved/9/orig -> origin/gh/nikitaved/9/orig 2025-12-04T11:11:09.6310307Z * [new branch] gh/oulgen/10/base -> origin/gh/oulgen/10/base 2025-12-04T11:11:09.6310378Z * [new branch] gh/oulgen/10/head -> origin/gh/oulgen/10/head 2025-12-04T11:11:09.6310450Z * [new branch] gh/oulgen/10/orig -> origin/gh/oulgen/10/orig 2025-12-04T11:11:09.6310519Z * [new branch] gh/oulgen/11/base -> origin/gh/oulgen/11/base 2025-12-04T11:11:09.6310588Z * [new branch] gh/oulgen/11/head -> origin/gh/oulgen/11/head 2025-12-04T11:11:09.6310659Z * [new branch] gh/oulgen/11/orig -> origin/gh/oulgen/11/orig 2025-12-04T11:11:09.6310727Z * [new branch] gh/oulgen/12/base -> origin/gh/oulgen/12/base 2025-12-04T11:11:09.6310832Z * [new branch] gh/oulgen/12/head -> origin/gh/oulgen/12/head 2025-12-04T11:11:09.6310902Z * [new branch] gh/oulgen/12/orig -> origin/gh/oulgen/12/orig 2025-12-04T11:11:09.6310999Z * [new branch] gh/oulgen/13/base -> origin/gh/oulgen/13/base 2025-12-04T11:11:09.6311071Z * [new branch] gh/oulgen/13/head -> origin/gh/oulgen/13/head 2025-12-04T11:11:09.6311139Z * [new branch] gh/oulgen/13/orig -> origin/gh/oulgen/13/orig 2025-12-04T11:11:09.6311208Z * [new branch] gh/oulgen/14/base -> origin/gh/oulgen/14/base 2025-12-04T11:11:09.6311281Z * [new branch] gh/oulgen/14/head -> origin/gh/oulgen/14/head 2025-12-04T11:11:09.6311350Z * [new branch] gh/oulgen/14/orig -> origin/gh/oulgen/14/orig 2025-12-04T11:11:09.6311419Z * [new branch] gh/oulgen/15/base -> origin/gh/oulgen/15/base 2025-12-04T11:11:09.6311492Z * [new branch] gh/oulgen/15/head -> origin/gh/oulgen/15/head 2025-12-04T11:11:09.6311562Z * [new branch] gh/oulgen/15/orig -> origin/gh/oulgen/15/orig 2025-12-04T11:11:09.6311631Z * [new branch] gh/oulgen/16/base -> origin/gh/oulgen/16/base 2025-12-04T11:11:09.6311704Z * [new branch] gh/oulgen/16/head -> origin/gh/oulgen/16/head 2025-12-04T11:11:09.6311772Z * [new branch] gh/oulgen/16/orig -> origin/gh/oulgen/16/orig 2025-12-04T11:11:09.6311843Z * [new branch] gh/oulgen/17/base -> 
origin/gh/oulgen/17/base 2025-12-04T11:11:09.6311916Z * [new branch] gh/oulgen/17/head -> origin/gh/oulgen/17/head 2025-12-04T11:11:09.6311985Z * [new branch] gh/oulgen/17/orig -> origin/gh/oulgen/17/orig 2025-12-04T11:11:09.6312055Z * [new branch] gh/oulgen/18/base -> origin/gh/oulgen/18/base 2025-12-04T11:11:09.6312130Z * [new branch] gh/oulgen/18/head -> origin/gh/oulgen/18/head 2025-12-04T11:11:09.6312199Z * [new branch] gh/oulgen/18/orig -> origin/gh/oulgen/18/orig 2025-12-04T11:11:09.6312270Z * [new branch] gh/oulgen/19/base -> origin/gh/oulgen/19/base 2025-12-04T11:11:09.6312343Z * [new branch] gh/oulgen/19/head -> origin/gh/oulgen/19/head 2025-12-04T11:11:09.6312412Z * [new branch] gh/oulgen/19/orig -> origin/gh/oulgen/19/orig 2025-12-04T11:11:09.6312480Z * [new branch] gh/oulgen/20/base -> origin/gh/oulgen/20/base 2025-12-04T11:11:09.6312552Z * [new branch] gh/oulgen/20/head -> origin/gh/oulgen/20/head 2025-12-04T11:11:09.6312621Z * [new branch] gh/oulgen/20/orig -> origin/gh/oulgen/20/orig 2025-12-04T11:11:09.6312693Z * [new branch] gh/oulgen/21/base -> origin/gh/oulgen/21/base 2025-12-04T11:11:09.6312762Z * [new branch] gh/oulgen/21/head -> origin/gh/oulgen/21/head 2025-12-04T11:11:09.6312830Z * [new branch] gh/oulgen/21/orig -> origin/gh/oulgen/21/orig 2025-12-04T11:11:09.6312899Z * [new branch] gh/oulgen/22/base -> origin/gh/oulgen/22/base 2025-12-04T11:11:09.6312969Z * [new branch] gh/oulgen/22/head -> origin/gh/oulgen/22/head 2025-12-04T11:11:09.6313038Z * [new branch] gh/oulgen/22/orig -> origin/gh/oulgen/22/orig 2025-12-04T11:11:09.6313111Z * [new branch] gh/oulgen/23/base -> origin/gh/oulgen/23/base 2025-12-04T11:11:09.6313179Z * [new branch] gh/oulgen/23/head -> origin/gh/oulgen/23/head 2025-12-04T11:11:09.6313247Z * [new branch] gh/oulgen/23/orig -> origin/gh/oulgen/23/orig 2025-12-04T11:11:09.6313319Z * [new branch] gh/oulgen/24/base -> origin/gh/oulgen/24/base 2025-12-04T11:11:09.6313410Z * [new branch] gh/oulgen/24/head -> origin/gh/oulgen/24/head 2025-12-04T11:11:09.6313479Z * [new branch] gh/oulgen/24/orig -> origin/gh/oulgen/24/orig 2025-12-04T11:11:09.6313550Z * [new branch] gh/oulgen/25/base -> origin/gh/oulgen/25/base 2025-12-04T11:11:09.6313641Z * [new branch] gh/oulgen/25/head -> origin/gh/oulgen/25/head 2025-12-04T11:11:09.6313709Z * [new branch] gh/oulgen/25/orig -> origin/gh/oulgen/25/orig 2025-12-04T11:11:09.6313781Z * [new branch] gh/oulgen/26/base -> origin/gh/oulgen/26/base 2025-12-04T11:11:09.6313848Z * [new branch] gh/oulgen/26/head -> origin/gh/oulgen/26/head 2025-12-04T11:11:09.6313916Z * [new branch] gh/oulgen/26/orig -> origin/gh/oulgen/26/orig 2025-12-04T11:11:09.6313989Z * [new branch] gh/oulgen/4/base -> origin/gh/oulgen/4/base 2025-12-04T11:11:09.6314058Z * [new branch] gh/oulgen/4/head -> origin/gh/oulgen/4/head 2025-12-04T11:11:09.6314128Z * [new branch] gh/oulgen/4/orig -> origin/gh/oulgen/4/orig 2025-12-04T11:11:09.6314202Z * [new branch] gh/oulgen/7/base -> origin/gh/oulgen/7/base 2025-12-04T11:11:09.6314272Z * [new branch] gh/oulgen/7/head -> origin/gh/oulgen/7/head 2025-12-04T11:11:09.6314344Z * [new branch] gh/oulgen/7/orig -> origin/gh/oulgen/7/orig 2025-12-04T11:11:09.6314411Z * [new branch] gh/oulgen/8/base -> origin/gh/oulgen/8/base 2025-12-04T11:11:09.6314478Z * [new branch] gh/oulgen/8/head -> origin/gh/oulgen/8/head 2025-12-04T11:11:09.6314549Z * [new branch] gh/oulgen/8/orig -> origin/gh/oulgen/8/orig 2025-12-04T11:11:09.6314617Z * [new branch] gh/oulgen/9/base -> origin/gh/oulgen/9/base 2025-12-04T11:11:09.6314685Z * [new 
branch] gh/oulgen/9/head -> origin/gh/oulgen/9/head 2025-12-04T11:11:09.6314757Z * [new branch] gh/oulgen/9/orig -> origin/gh/oulgen/9/orig 2025-12-04T11:11:09.6314865Z * [new branch] gh/patvig/mtia-serialization -> origin/gh/patvig/mtia-serialization 2025-12-04T11:11:09.6314939Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-12-04T11:11:09.6315012Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-12-04T11:11:09.6315081Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-12-04T11:11:09.6315151Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-12-04T11:11:09.6315222Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-12-04T11:11:09.6315291Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-12-04T11:11:09.6315360Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-12-04T11:11:09.6315433Z * [new branch] gh/pearu/110/head -> origin/gh/pearu/110/head 2025-12-04T11:11:09.6315502Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-12-04T11:11:09.6315572Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-12-04T11:11:09.6315643Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-12-04T11:11:09.6315714Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-12-04T11:11:09.6315784Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-12-04T11:11:09.6315856Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-12-04T11:11:09.6315925Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-12-04T11:11:09.6315997Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-12-04T11:11:09.6316090Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-12-04T11:11:09.6316160Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-12-04T11:11:09.6316256Z * [new branch] gh/pearu/116/base -> origin/gh/pearu/116/base 2025-12-04T11:11:09.6316326Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-12-04T11:11:09.6316395Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-12-04T11:11:09.6316469Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-12-04T11:11:09.6316538Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-12-04T11:11:09.6316606Z * [new branch] gh/pearu/117/orig -> origin/gh/pearu/117/orig 2025-12-04T11:11:09.6316677Z * [new branch] gh/pearu/118/base -> origin/gh/pearu/118/base 2025-12-04T11:11:09.6316747Z * [new branch] gh/pearu/118/head -> origin/gh/pearu/118/head 2025-12-04T11:11:09.6316817Z * [new branch] gh/pearu/118/orig -> origin/gh/pearu/118/orig 2025-12-04T11:11:09.6316889Z * [new branch] gh/pearu/119/base -> origin/gh/pearu/119/base 2025-12-04T11:11:09.6316959Z * [new branch] gh/pearu/119/head -> origin/gh/pearu/119/head 2025-12-04T11:11:09.6317027Z * [new branch] gh/pearu/119/orig -> origin/gh/pearu/119/orig 2025-12-04T11:11:09.6317098Z * [new branch] gh/pearu/139/base -> origin/gh/pearu/139/base 2025-12-04T11:11:09.6317167Z * [new branch] gh/pearu/139/head -> origin/gh/pearu/139/head 2025-12-04T11:11:09.6317237Z * [new branch] gh/pearu/139/orig -> origin/gh/pearu/139/orig 2025-12-04T11:11:09.6317308Z * [new branch] gh/pearu/140/base -> origin/gh/pearu/140/base 2025-12-04T11:11:09.6317378Z * [new branch] gh/pearu/140/head -> origin/gh/pearu/140/head 2025-12-04T11:11:09.6317447Z * [new branch] gh/pearu/140/orig -> origin/gh/pearu/140/orig 2025-12-04T11:11:09.6317520Z * [new branch] gh/pearu/142/base 
-> origin/gh/pearu/142/base 2025-12-04T11:11:09.6317591Z * [new branch] gh/pearu/142/head -> origin/gh/pearu/142/head 2025-12-04T11:11:09.6317666Z * [new branch] gh/pearu/142/orig -> origin/gh/pearu/142/orig 2025-12-04T11:11:09.6317735Z * [new branch] gh/pearu/143/base -> origin/gh/pearu/143/base 2025-12-04T11:11:09.6317805Z * [new branch] gh/pearu/143/head -> origin/gh/pearu/143/head 2025-12-04T11:11:09.6317878Z * [new branch] gh/pearu/143/orig -> origin/gh/pearu/143/orig 2025-12-04T11:11:09.6317949Z * [new branch] gh/pearu/147/base -> origin/gh/pearu/147/base 2025-12-04T11:11:09.6318019Z * [new branch] gh/pearu/147/head -> origin/gh/pearu/147/head 2025-12-04T11:11:09.6318092Z * [new branch] gh/pearu/147/orig -> origin/gh/pearu/147/orig 2025-12-04T11:11:09.6318199Z * [new branch] gh/pearu/149/base -> origin/gh/pearu/149/base 2025-12-04T11:11:09.6318269Z * [new branch] gh/pearu/149/head -> origin/gh/pearu/149/head 2025-12-04T11:11:09.6318340Z * [new branch] gh/pearu/149/orig -> origin/gh/pearu/149/orig 2025-12-04T11:11:09.6318408Z * [new branch] gh/pearu/150/base -> origin/gh/pearu/150/base 2025-12-04T11:11:09.6318477Z * [new branch] gh/pearu/150/head -> origin/gh/pearu/150/head 2025-12-04T11:11:09.6318549Z * [new branch] gh/pearu/150/orig -> origin/gh/pearu/150/orig 2025-12-04T11:11:09.6318619Z * [new branch] gh/pearu/151/base -> origin/gh/pearu/151/base 2025-12-04T11:11:09.6318727Z * [new branch] gh/pearu/151/head -> origin/gh/pearu/151/head 2025-12-04T11:11:09.6318800Z * [new branch] gh/pearu/151/orig -> origin/gh/pearu/151/orig 2025-12-04T11:11:09.6318869Z * [new branch] gh/pearu/152/base -> origin/gh/pearu/152/base 2025-12-04T11:11:09.6318966Z * [new branch] gh/pearu/152/head -> origin/gh/pearu/152/head 2025-12-04T11:11:09.6319039Z * [new branch] gh/pearu/152/orig -> origin/gh/pearu/152/orig 2025-12-04T11:11:09.6319108Z * [new branch] gh/pearu/153/base -> origin/gh/pearu/153/base 2025-12-04T11:11:09.6319176Z * [new branch] gh/pearu/153/head -> origin/gh/pearu/153/head 2025-12-04T11:11:09.6319246Z * [new branch] gh/pearu/153/orig -> origin/gh/pearu/153/orig 2025-12-04T11:11:09.6319317Z * [new branch] gh/pearu/154/base -> origin/gh/pearu/154/base 2025-12-04T11:11:09.6319386Z * [new branch] gh/pearu/154/head -> origin/gh/pearu/154/head 2025-12-04T11:11:09.6319458Z * [new branch] gh/pearu/154/orig -> origin/gh/pearu/154/orig 2025-12-04T11:11:09.6319527Z * [new branch] gh/pearu/155/base -> origin/gh/pearu/155/base 2025-12-04T11:11:09.6319600Z * [new branch] gh/pearu/155/head -> origin/gh/pearu/155/head 2025-12-04T11:11:09.6319667Z * [new branch] gh/pearu/155/orig -> origin/gh/pearu/155/orig 2025-12-04T11:11:09.6319736Z * [new branch] gh/pearu/156/base -> origin/gh/pearu/156/base 2025-12-04T11:11:09.6319808Z * [new branch] gh/pearu/156/head -> origin/gh/pearu/156/head 2025-12-04T11:11:09.6319877Z * [new branch] gh/pearu/156/orig -> origin/gh/pearu/156/orig 2025-12-04T11:11:09.6319945Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-12-04T11:11:09.6320016Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-12-04T11:11:09.6320086Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-12-04T11:11:09.6320154Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-12-04T11:11:09.6320227Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-12-04T11:11:09.6320294Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-12-04T11:11:09.6320373Z * [new branch] gh/pianpwk/21/base -> origin/gh/pianpwk/21/base 
2025-12-04T11:11:09.6320450Z * [new branch] gh/pianpwk/21/head -> origin/gh/pianpwk/21/head 2025-12-04T11:11:09.6320523Z * [new branch] gh/pianpwk/28/base -> origin/gh/pianpwk/28/base 2025-12-04T11:11:09.6320595Z * [new branch] gh/pianpwk/28/head -> origin/gh/pianpwk/28/head 2025-12-04T11:11:09.6320670Z * [new branch] gh/pianpwk/28/orig -> origin/gh/pianpwk/28/orig 2025-12-04T11:11:09.6320746Z * [new branch] gh/pianpwk/29/base -> origin/gh/pianpwk/29/base 2025-12-04T11:11:09.6320817Z * [new branch] gh/pianpwk/29/head -> origin/gh/pianpwk/29/head 2025-12-04T11:11:09.6320892Z * [new branch] gh/pianpwk/29/orig -> origin/gh/pianpwk/29/orig 2025-12-04T11:11:09.6320964Z * [new branch] gh/pianpwk/30/base -> origin/gh/pianpwk/30/base 2025-12-04T11:11:09.6321036Z * [new branch] gh/pianpwk/30/head -> origin/gh/pianpwk/30/head 2025-12-04T11:11:09.6321113Z * [new branch] gh/pianpwk/30/orig -> origin/gh/pianpwk/30/orig 2025-12-04T11:11:09.6321184Z * [new branch] gh/pianpwk/31/base -> origin/gh/pianpwk/31/base 2025-12-04T11:11:09.6321259Z * [new branch] gh/pianpwk/31/head -> origin/gh/pianpwk/31/head 2025-12-04T11:11:09.6321331Z * [new branch] gh/pianpwk/31/orig -> origin/gh/pianpwk/31/orig 2025-12-04T11:11:09.6321425Z * [new branch] gh/pianpwk/32/base -> origin/gh/pianpwk/32/base 2025-12-04T11:11:09.6321501Z * [new branch] gh/pianpwk/32/head -> origin/gh/pianpwk/32/head 2025-12-04T11:11:09.6321593Z * [new branch] gh/pianpwk/32/orig -> origin/gh/pianpwk/32/orig 2025-12-04T11:11:09.6321664Z * [new branch] gh/pianpwk/33/base -> origin/gh/pianpwk/33/base 2025-12-04T11:11:09.6321739Z * [new branch] gh/pianpwk/33/head -> origin/gh/pianpwk/33/head 2025-12-04T11:11:09.6321810Z * [new branch] gh/pianpwk/33/orig -> origin/gh/pianpwk/33/orig 2025-12-04T11:11:09.6321881Z * [new branch] gh/pianpwk/34/base -> origin/gh/pianpwk/34/base 2025-12-04T11:11:09.6321956Z * [new branch] gh/pianpwk/34/head -> origin/gh/pianpwk/34/head 2025-12-04T11:11:09.6322026Z * [new branch] gh/pianpwk/34/orig -> origin/gh/pianpwk/34/orig 2025-12-04T11:11:09.6322098Z * [new branch] gh/pianpwk/35/base -> origin/gh/pianpwk/35/base 2025-12-04T11:11:09.6322171Z * [new branch] gh/pianpwk/35/head -> origin/gh/pianpwk/35/head 2025-12-04T11:11:09.6322242Z * [new branch] gh/pianpwk/35/orig -> origin/gh/pianpwk/35/orig 2025-12-04T11:11:09.6322314Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-12-04T11:11:09.6322386Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-12-04T11:11:09.6322454Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-12-04T11:11:09.6322519Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-12-04T11:11:09.6322589Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-12-04T11:11:09.6322655Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-12-04T11:11:09.6322721Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-12-04T11:11:09.6322790Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-12-04T11:11:09.6322857Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-12-04T11:11:09.6322924Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-12-04T11:11:09.6322992Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-12-04T11:11:09.6323058Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-12-04T11:11:09.6323127Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-12-04T11:11:09.6323192Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-12-04T11:11:09.6323258Z * [new branch] 
gh/rec/167/base -> origin/gh/rec/167/base 2025-12-04T11:11:09.6323328Z * [new branch] gh/rec/167/head -> origin/gh/rec/167/head 2025-12-04T11:11:09.6323396Z * [new branch] gh/rec/167/orig -> origin/gh/rec/167/orig 2025-12-04T11:11:09.6323461Z * [new branch] gh/rec/168/base -> origin/gh/rec/168/base 2025-12-04T11:11:09.6323533Z * [new branch] gh/rec/168/head -> origin/gh/rec/168/head 2025-12-04T11:11:09.6323599Z * [new branch] gh/rec/168/orig -> origin/gh/rec/168/orig 2025-12-04T11:11:09.6323665Z * [new branch] gh/rec/169/base -> origin/gh/rec/169/base 2025-12-04T11:11:09.6323735Z * [new branch] gh/rec/169/head -> origin/gh/rec/169/head 2025-12-04T11:11:09.6323799Z * [new branch] gh/rec/169/orig -> origin/gh/rec/169/orig 2025-12-04T11:11:09.6323865Z * [new branch] gh/rec/170/base -> origin/gh/rec/170/base 2025-12-04T11:11:09.6323933Z * [new branch] gh/rec/170/head -> origin/gh/rec/170/head 2025-12-04T11:11:09.6324018Z * [new branch] gh/rec/170/orig -> origin/gh/rec/170/orig 2025-12-04T11:11:09.6324085Z * [new branch] gh/rec/171/base -> origin/gh/rec/171/base 2025-12-04T11:11:09.6324174Z * [new branch] gh/rec/171/head -> origin/gh/rec/171/head 2025-12-04T11:11:09.6324239Z * [new branch] gh/rec/171/orig -> origin/gh/rec/171/orig 2025-12-04T11:11:09.6324304Z * [new branch] gh/rec/172/base -> origin/gh/rec/172/base 2025-12-04T11:11:09.6324373Z * [new branch] gh/rec/172/head -> origin/gh/rec/172/head 2025-12-04T11:11:09.6324438Z * [new branch] gh/rec/172/orig -> origin/gh/rec/172/orig 2025-12-04T11:11:09.6324505Z * [new branch] gh/rec/173/base -> origin/gh/rec/173/base 2025-12-04T11:11:09.6324573Z * [new branch] gh/rec/173/head -> origin/gh/rec/173/head 2025-12-04T11:11:09.6324640Z * [new branch] gh/rec/173/orig -> origin/gh/rec/173/orig 2025-12-04T11:11:09.6324705Z * [new branch] gh/rec/174/base -> origin/gh/rec/174/base 2025-12-04T11:11:09.6324774Z * [new branch] gh/rec/174/head -> origin/gh/rec/174/head 2025-12-04T11:11:09.6324842Z * [new branch] gh/rec/174/orig -> origin/gh/rec/174/orig 2025-12-04T11:11:09.6324911Z * [new branch] gh/rec/175/base -> origin/gh/rec/175/base 2025-12-04T11:11:09.6324977Z * [new branch] gh/rec/175/head -> origin/gh/rec/175/head 2025-12-04T11:11:09.6325042Z * [new branch] gh/rec/175/orig -> origin/gh/rec/175/orig 2025-12-04T11:11:09.6325110Z * [new branch] gh/rec/176/base -> origin/gh/rec/176/base 2025-12-04T11:11:09.6325173Z * [new branch] gh/rec/176/head -> origin/gh/rec/176/head 2025-12-04T11:11:09.6325239Z * [new branch] gh/rec/176/orig -> origin/gh/rec/176/orig 2025-12-04T11:11:09.6325307Z * [new branch] gh/rec/177/base -> origin/gh/rec/177/base 2025-12-04T11:11:09.6325373Z * [new branch] gh/rec/177/head -> origin/gh/rec/177/head 2025-12-04T11:11:09.6325441Z * [new branch] gh/rec/177/orig -> origin/gh/rec/177/orig 2025-12-04T11:11:09.6325535Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-12-04T11:11:09.6325623Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-12-04T11:11:09.6325708Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-12-04T11:11:09.6325795Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-12-04T11:11:09.6325880Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-12-04T11:11:09.6325965Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-12-04T11:11:09.6326052Z * [new branch] gh/robert-hardwick/5/base -> origin/gh/robert-hardwick/5/base 
2025-12-04T11:11:09.6326136Z * [new branch] gh/robert-hardwick/5/head -> origin/gh/robert-hardwick/5/head 2025-12-04T11:11:09.6326221Z * [new branch] gh/robert-hardwick/5/orig -> origin/gh/robert-hardwick/5/orig 2025-12-04T11:11:09.6326310Z * [new branch] gh/robert-hardwick/6/base -> origin/gh/robert-hardwick/6/base 2025-12-04T11:11:09.6326394Z * [new branch] gh/robert-hardwick/6/head -> origin/gh/robert-hardwick/6/head 2025-12-04T11:11:09.6326478Z * [new branch] gh/robert-hardwick/6/orig -> origin/gh/robert-hardwick/6/orig 2025-12-04T11:11:09.6326564Z * [new branch] gh/robert-hardwick/7/base -> origin/gh/robert-hardwick/7/base 2025-12-04T11:11:09.6326648Z * [new branch] gh/robert-hardwick/7/head -> origin/gh/robert-hardwick/7/head 2025-12-04T11:11:09.6326754Z * [new branch] gh/robert-hardwick/7/orig -> origin/gh/robert-hardwick/7/orig 2025-12-04T11:11:09.6326841Z * [new branch] gh/robert-hardwick/8/base -> origin/gh/robert-hardwick/8/base 2025-12-04T11:11:09.6326950Z * [new branch] gh/robert-hardwick/8/head -> origin/gh/robert-hardwick/8/head 2025-12-04T11:11:09.6327038Z * [new branch] gh/robert-hardwick/8/orig -> origin/gh/robert-hardwick/8/orig 2025-12-04T11:11:09.6327121Z * [new branch] gh/robert-hardwick/9/base -> origin/gh/robert-hardwick/9/base 2025-12-04T11:11:09.6327207Z * [new branch] gh/robert-hardwick/9/head -> origin/gh/robert-hardwick/9/head 2025-12-04T11:11:09.6327295Z * [new branch] gh/robert-hardwick/9/orig -> origin/gh/robert-hardwick/9/orig 2025-12-04T11:11:09.6327367Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-12-04T11:11:09.6327440Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-12-04T11:11:09.6327513Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-12-04T11:11:09.6327581Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-12-04T11:11:09.6327654Z * [new branch] gh/rtimpe/22/base -> origin/gh/rtimpe/22/base 2025-12-04T11:11:09.6327727Z * [new branch] gh/rtimpe/22/head -> origin/gh/rtimpe/22/head 2025-12-04T11:11:09.6327798Z * [new branch] gh/rtimpe/22/orig -> origin/gh/rtimpe/22/orig 2025-12-04T11:11:09.6327867Z * [new branch] gh/rtimpe/23/base -> origin/gh/rtimpe/23/base 2025-12-04T11:11:09.6327940Z * [new branch] gh/rtimpe/23/head -> origin/gh/rtimpe/23/head 2025-12-04T11:11:09.6328010Z * [new branch] gh/rtimpe/23/orig -> origin/gh/rtimpe/23/orig 2025-12-04T11:11:09.6328080Z * [new branch] gh/rtimpe/24/base -> origin/gh/rtimpe/24/base 2025-12-04T11:11:09.6328194Z * [new branch] gh/rtimpe/24/head -> origin/gh/rtimpe/24/head 2025-12-04T11:11:09.6328264Z * [new branch] gh/rtimpe/24/orig -> origin/gh/rtimpe/24/orig 2025-12-04T11:11:09.6328338Z * [new branch] gh/rtimpe/25/base -> origin/gh/rtimpe/25/base 2025-12-04T11:11:09.6328407Z * [new branch] gh/rtimpe/25/head -> origin/gh/rtimpe/25/head 2025-12-04T11:11:09.6328477Z * [new branch] gh/rtimpe/25/orig -> origin/gh/rtimpe/25/orig 2025-12-04T11:11:09.6328548Z * [new branch] gh/rtimpe/26/base -> origin/gh/rtimpe/26/base 2025-12-04T11:11:09.6328617Z * [new branch] gh/rtimpe/26/head -> origin/gh/rtimpe/26/head 2025-12-04T11:11:09.6328686Z * [new branch] gh/rtimpe/26/orig -> origin/gh/rtimpe/26/orig 2025-12-04T11:11:09.6328760Z * [new branch] gh/rtimpe/27/base -> origin/gh/rtimpe/27/base 2025-12-04T11:11:09.6328829Z * [new branch] gh/rtimpe/27/head -> origin/gh/rtimpe/27/head 2025-12-04T11:11:09.6328898Z * [new branch] gh/rtimpe/27/orig -> origin/gh/rtimpe/27/orig 2025-12-04T11:11:09.6328972Z * [new branch] gh/rtimpe/28/base -> origin/gh/rtimpe/28/base 
2025-12-04T11:11:09.6329041Z * [new branch] gh/rtimpe/28/head -> origin/gh/rtimpe/28/head 2025-12-04T11:11:09.6329109Z * [new branch] gh/rtimpe/28/orig -> origin/gh/rtimpe/28/orig 2025-12-04T11:11:09.6329182Z * [new branch] gh/rtimpe/29/base -> origin/gh/rtimpe/29/base 2025-12-04T11:11:09.6329254Z * [new branch] gh/rtimpe/29/head -> origin/gh/rtimpe/29/head 2025-12-04T11:11:09.6329325Z * [new branch] gh/rtimpe/29/orig -> origin/gh/rtimpe/29/orig 2025-12-04T11:11:09.6329397Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-12-04T11:11:09.6329499Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-12-04T11:11:09.6329571Z * [new branch] gh/rtimpe/30/base -> origin/gh/rtimpe/30/base 2025-12-04T11:11:09.6329668Z * [new branch] gh/rtimpe/30/head -> origin/gh/rtimpe/30/head 2025-12-04T11:11:09.6329738Z * [new branch] gh/rtimpe/30/orig -> origin/gh/rtimpe/30/orig 2025-12-04T11:11:09.6329806Z * [new branch] gh/rtimpe/31/base -> origin/gh/rtimpe/31/base 2025-12-04T11:11:09.6329878Z * [new branch] gh/rtimpe/31/head -> origin/gh/rtimpe/31/head 2025-12-04T11:11:09.6329946Z * [new branch] gh/rtimpe/31/orig -> origin/gh/rtimpe/31/orig 2025-12-04T11:11:09.6330019Z * [new branch] gh/rtimpe/32/base -> origin/gh/rtimpe/32/base 2025-12-04T11:11:09.6330088Z * [new branch] gh/rtimpe/32/head -> origin/gh/rtimpe/32/head 2025-12-04T11:11:09.6330159Z * [new branch] gh/rtimpe/32/orig -> origin/gh/rtimpe/32/orig 2025-12-04T11:11:09.6330230Z * [new branch] gh/rtimpe/33/base -> origin/gh/rtimpe/33/base 2025-12-04T11:11:09.6330303Z * [new branch] gh/rtimpe/33/head -> origin/gh/rtimpe/33/head 2025-12-04T11:11:09.6330372Z * [new branch] gh/rtimpe/33/orig -> origin/gh/rtimpe/33/orig 2025-12-04T11:11:09.6330447Z * [new branch] gh/rtimpe/34/base -> origin/gh/rtimpe/34/base 2025-12-04T11:11:09.6330516Z * [new branch] gh/rtimpe/34/head -> origin/gh/rtimpe/34/head 2025-12-04T11:11:09.6330585Z * [new branch] gh/rtimpe/34/orig -> origin/gh/rtimpe/34/orig 2025-12-04T11:11:09.6330656Z * [new branch] gh/rtimpe/35/base -> origin/gh/rtimpe/35/base 2025-12-04T11:11:09.6330724Z * [new branch] gh/rtimpe/35/head -> origin/gh/rtimpe/35/head 2025-12-04T11:11:09.6330795Z * [new branch] gh/rtimpe/35/orig -> origin/gh/rtimpe/35/orig 2025-12-04T11:11:09.6330868Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-12-04T11:11:09.6330938Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-12-04T11:11:09.6331026Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-12-04T11:11:09.6331112Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-12-04T11:11:09.6331191Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-12-04T11:11:09.6331271Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-12-04T11:11:09.6331353Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-12-04T11:11:09.6331430Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-12-04T11:11:09.6331510Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-12-04T11:11:09.6331597Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-12-04T11:11:09.6331681Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-12-04T11:11:09.6331765Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-12-04T11:11:09.6331844Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 
2025-12-04T11:11:09.6331922Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-12-04T11:11:09.6332004Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-12-04T11:11:09.6332081Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-12-04T11:11:09.6332182Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-12-04T11:11:09.6332263Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-12-04T11:11:09.6332343Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-12-04T11:11:09.6332446Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-12-04T11:11:09.6332526Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-12-04T11:11:09.6332604Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-12-04T11:11:09.6332681Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-12-04T11:11:09.6332765Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-12-04T11:11:09.6332843Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-12-04T11:11:09.6332920Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-12-04T11:11:09.6333001Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-12-04T11:11:09.6333077Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-12-04T11:11:09.6333155Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-12-04T11:11:09.6333232Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-12-04T11:11:09.6333307Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-12-04T11:11:09.6333381Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-12-04T11:11:09.6333459Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-12-04T11:11:09.6333534Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-12-04T11:11:09.6333615Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-12-04T11:11:09.6333691Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-12-04T11:11:09.6333769Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-12-04T11:11:09.6333847Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-12-04T11:11:09.6333923Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-12-04T11:11:09.6333999Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-12-04T11:11:09.6334078Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-12-04T11:11:09.6334155Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-12-04T11:11:09.6334231Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-12-04T11:11:09.6334313Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-12-04T11:11:09.6334388Z * [new branch] gh/seemethere/71/base -> origin/gh/seemethere/71/base 2025-12-04T11:11:09.6334465Z * [new branch] gh/seemethere/71/head -> origin/gh/seemethere/71/head 2025-12-04T11:11:09.6334546Z * [new branch] gh/seemethere/71/orig -> origin/gh/seemethere/71/orig 2025-12-04T11:11:09.6334621Z * [new branch] gh/seemethere/72/base -> origin/gh/seemethere/72/base 2025-12-04T11:11:09.6334696Z * [new branch] gh/seemethere/72/head -> 
origin/gh/seemethere/72/head 2025-12-04T11:11:09.6334774Z * [new branch] gh/seemethere/72/orig -> origin/gh/seemethere/72/orig 2025-12-04T11:11:09.6334850Z * [new branch] gh/seemethere/73/base -> origin/gh/seemethere/73/base 2025-12-04T11:11:09.6334926Z * [new branch] gh/seemethere/73/head -> origin/gh/seemethere/73/head 2025-12-04T11:11:09.6335026Z * [new branch] gh/seemethere/73/orig -> origin/gh/seemethere/73/orig 2025-12-04T11:11:09.6335102Z * [new branch] gh/seemethere/74/base -> origin/gh/seemethere/74/base 2025-12-04T11:11:09.6335203Z * [new branch] gh/seemethere/74/head -> origin/gh/seemethere/74/head 2025-12-04T11:11:09.6335279Z * [new branch] gh/seemethere/74/orig -> origin/gh/seemethere/74/orig 2025-12-04T11:11:09.6335354Z * [new branch] gh/seemethere/75/base -> origin/gh/seemethere/75/base 2025-12-04T11:11:09.6335432Z * [new branch] gh/seemethere/75/head -> origin/gh/seemethere/75/head 2025-12-04T11:11:09.6335509Z * [new branch] gh/seemethere/75/orig -> origin/gh/seemethere/75/orig 2025-12-04T11:11:09.6335585Z * [new branch] gh/seemethere/76/base -> origin/gh/seemethere/76/base 2025-12-04T11:11:09.6335665Z * [new branch] gh/seemethere/76/head -> origin/gh/seemethere/76/head 2025-12-04T11:11:09.6335741Z * [new branch] gh/seemethere/76/orig -> origin/gh/seemethere/76/orig 2025-12-04T11:11:09.6335821Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-12-04T11:11:09.6335905Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-12-04T11:11:09.6335983Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-12-04T11:11:09.6336062Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-12-04T11:11:09.6336142Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-12-04T11:11:09.6336218Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-12-04T11:11:09.6336294Z * [new branch] gh/shunting314/249/base -> origin/gh/shunting314/249/base 2025-12-04T11:11:09.6336375Z * [new branch] gh/shunting314/249/head -> origin/gh/shunting314/249/head 2025-12-04T11:11:09.6336452Z * [new branch] gh/shunting314/249/orig -> origin/gh/shunting314/249/orig 2025-12-04T11:11:09.6336528Z * [new branch] gh/shunting314/253/base -> origin/gh/shunting314/253/base 2025-12-04T11:11:09.6336607Z * [new branch] gh/shunting314/253/head -> origin/gh/shunting314/253/head 2025-12-04T11:11:09.6336684Z * [new branch] gh/shunting314/253/orig -> origin/gh/shunting314/253/orig 2025-12-04T11:11:09.6336906Z * [new branch] gh/shunting314/256/base -> origin/gh/shunting314/256/base 2025-12-04T11:11:09.6336984Z * [new branch] gh/shunting314/256/head -> origin/gh/shunting314/256/head 2025-12-04T11:11:09.6337062Z * [new branch] gh/shunting314/256/orig -> origin/gh/shunting314/256/orig 2025-12-04T11:11:09.6337142Z * [new branch] gh/shunting314/257/base -> origin/gh/shunting314/257/base 2025-12-04T11:11:09.6337219Z * [new branch] gh/shunting314/257/head -> origin/gh/shunting314/257/head 2025-12-04T11:11:09.6337295Z * [new branch] gh/shunting314/257/orig -> origin/gh/shunting314/257/orig 2025-12-04T11:11:09.6337378Z * [new branch] gh/shunting314/258/base -> origin/gh/shunting314/258/base 2025-12-04T11:11:09.6337454Z * [new branch] gh/shunting314/258/head -> origin/gh/shunting314/258/head 2025-12-04T11:11:09.6337531Z * [new branch] gh/shunting314/258/orig -> origin/gh/shunting314/258/orig 2025-12-04T11:11:09.6337611Z * [new branch] gh/shunting314/259/base -> origin/gh/shunting314/259/base 
2025-12-04T11:11:09.6337688Z * [new branch] gh/shunting314/259/head -> origin/gh/shunting314/259/head 2025-12-04T11:11:09.6337765Z * [new branch] gh/shunting314/259/orig -> origin/gh/shunting314/259/orig 2025-12-04T11:11:09.6337871Z * [new branch] gh/shunting314/260/base -> origin/gh/shunting314/260/base 2025-12-04T11:11:09.6337949Z * [new branch] gh/shunting314/260/head -> origin/gh/shunting314/260/head 2025-12-04T11:11:09.6338026Z * [new branch] gh/shunting314/260/orig -> origin/gh/shunting314/260/orig 2025-12-04T11:11:09.6338142Z * [new branch] gh/shunting314/261/base -> origin/gh/shunting314/261/base 2025-12-04T11:11:09.6338254Z * [new branch] gh/shunting314/261/head -> origin/gh/shunting314/261/head 2025-12-04T11:11:09.6338331Z * [new branch] gh/shunting314/261/orig -> origin/gh/shunting314/261/orig 2025-12-04T11:11:09.6338412Z * [new branch] gh/shunting314/262/base -> origin/gh/shunting314/262/base 2025-12-04T11:11:09.6338490Z * [new branch] gh/shunting314/262/head -> origin/gh/shunting314/262/head 2025-12-04T11:11:09.6338572Z * [new branch] gh/shunting314/262/orig -> origin/gh/shunting314/262/orig 2025-12-04T11:11:09.6338656Z * [new branch] gh/shunting314/263/base -> origin/gh/shunting314/263/base 2025-12-04T11:11:09.6338734Z * [new branch] gh/shunting314/263/head -> origin/gh/shunting314/263/head 2025-12-04T11:11:09.6338816Z * [new branch] gh/shunting314/263/orig -> origin/gh/shunting314/263/orig 2025-12-04T11:11:09.6338894Z * [new branch] gh/shunting314/264/base -> origin/gh/shunting314/264/base 2025-12-04T11:11:09.6338971Z * [new branch] gh/shunting314/264/head -> origin/gh/shunting314/264/head 2025-12-04T11:11:09.6339050Z * [new branch] gh/shunting314/264/orig -> origin/gh/shunting314/264/orig 2025-12-04T11:11:09.6339126Z * [new branch] gh/shunting314/265/base -> origin/gh/shunting314/265/base 2025-12-04T11:11:09.6339203Z * [new branch] gh/shunting314/265/head -> origin/gh/shunting314/265/head 2025-12-04T11:11:09.6339285Z * [new branch] gh/shunting314/265/orig -> origin/gh/shunting314/265/orig 2025-12-04T11:11:09.6339364Z * [new branch] gh/shunting314/266/base -> origin/gh/shunting314/266/base 2025-12-04T11:11:09.6339441Z * [new branch] gh/shunting314/266/head -> origin/gh/shunting314/266/head 2025-12-04T11:11:09.6339525Z * [new branch] gh/shunting314/266/orig -> origin/gh/shunting314/266/orig 2025-12-04T11:11:09.6339603Z * [new branch] gh/shunting314/267/base -> origin/gh/shunting314/267/base 2025-12-04T11:11:09.6339681Z * [new branch] gh/shunting314/267/head -> origin/gh/shunting314/267/head 2025-12-04T11:11:09.6339762Z * [new branch] gh/shunting314/267/orig -> origin/gh/shunting314/267/orig 2025-12-04T11:11:09.6339840Z * [new branch] gh/shunting314/268/base -> origin/gh/shunting314/268/base 2025-12-04T11:11:09.6339917Z * [new branch] gh/shunting314/268/head -> origin/gh/shunting314/268/head 2025-12-04T11:11:09.6340000Z * [new branch] gh/shunting314/268/orig -> origin/gh/shunting314/268/orig 2025-12-04T11:11:09.6340079Z * [new branch] gh/shunting314/269/base -> origin/gh/shunting314/269/base 2025-12-04T11:11:09.6340157Z * [new branch] gh/shunting314/269/head -> origin/gh/shunting314/269/head 2025-12-04T11:11:09.6340240Z * [new branch] gh/shunting314/269/orig -> origin/gh/shunting314/269/orig 2025-12-04T11:11:09.6340316Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-12-04T11:11:09.6340394Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-12-04T11:11:09.6340470Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 
2025-12-04T11:11:09.6340545Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-12-04T11:11:09.6340625Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-12-04T11:11:09.6340725Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-12-04T11:11:09.6340799Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-12-04T11:11:09.6340876Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-12-04T11:11:09.6340978Z * [new branch] gh/slayton58/39/base -> origin/gh/slayton58/39/base 2025-12-04T11:11:09.6341054Z * [new branch] gh/slayton58/39/head -> origin/gh/slayton58/39/head 2025-12-04T11:11:09.6341133Z * [new branch] gh/slayton58/39/orig -> origin/gh/slayton58/39/orig 2025-12-04T11:11:09.6341207Z * [new branch] gh/slayton58/42/base -> origin/gh/slayton58/42/base 2025-12-04T11:11:09.6341280Z * [new branch] gh/slayton58/42/head -> origin/gh/slayton58/42/head 2025-12-04T11:11:09.6341358Z * [new branch] gh/slayton58/42/orig -> origin/gh/slayton58/42/orig 2025-12-04T11:11:09.6341432Z * [new branch] gh/slayton58/43/base -> origin/gh/slayton58/43/base 2025-12-04T11:11:09.6341507Z * [new branch] gh/slayton58/43/head -> origin/gh/slayton58/43/head 2025-12-04T11:11:09.6341583Z * [new branch] gh/slayton58/43/orig -> origin/gh/slayton58/43/orig 2025-12-04T11:11:09.6341658Z * [new branch] gh/slayton58/44/base -> origin/gh/slayton58/44/base 2025-12-04T11:11:09.6341731Z * [new branch] gh/slayton58/44/head -> origin/gh/slayton58/44/head 2025-12-04T11:11:09.6341808Z * [new branch] gh/slayton58/44/orig -> origin/gh/slayton58/44/orig 2025-12-04T11:11:09.6341881Z * [new branch] gh/slayton58/45/base -> origin/gh/slayton58/45/base 2025-12-04T11:11:09.6341959Z * [new branch] gh/slayton58/45/head -> origin/gh/slayton58/45/head 2025-12-04T11:11:09.6342033Z * [new branch] gh/slayton58/45/orig -> origin/gh/slayton58/45/orig 2025-12-04T11:11:09.6342106Z * [new branch] gh/slayton58/46/base -> origin/gh/slayton58/46/base 2025-12-04T11:11:09.6342183Z * [new branch] gh/slayton58/46/head -> origin/gh/slayton58/46/head 2025-12-04T11:11:09.6342255Z * [new branch] gh/slayton58/46/orig -> origin/gh/slayton58/46/orig 2025-12-04T11:11:09.6342331Z * [new branch] gh/slayton58/6/base -> origin/gh/slayton58/6/base 2025-12-04T11:11:09.6342407Z * [new branch] gh/slayton58/6/head -> origin/gh/slayton58/6/head 2025-12-04T11:11:09.6342480Z * [new branch] gh/slayton58/7/base -> origin/gh/slayton58/7/base 2025-12-04T11:11:09.6342552Z * [new branch] gh/slayton58/7/head -> origin/gh/slayton58/7/head 2025-12-04T11:11:09.6342633Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-12-04T11:11:09.6342710Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-12-04T11:11:09.6342787Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-12-04T11:11:09.6342865Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-12-04T11:11:09.6342940Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-12-04T11:11:09.6343017Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-12-04T11:11:09.6343097Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-12-04T11:11:09.6343172Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-12-04T11:11:09.6343247Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-12-04T11:11:09.6343325Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 
2025-12-04T11:11:09.6343401Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-12-04T11:11:09.6343500Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-12-04T11:11:09.6343581Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-12-04T11:11:09.6343678Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-12-04T11:11:09.6343756Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-12-04T11:11:09.6343830Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-12-04T11:11:09.6343906Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-12-04T11:11:09.6343984Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-12-04T11:11:09.6344057Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-12-04T11:11:09.6344132Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-12-04T11:11:09.6344211Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-12-04T11:11:09.6344286Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-12-04T11:11:09.6344362Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-12-04T11:11:09.6344440Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-12-04T11:11:09.6344514Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-12-04T11:11:09.6344589Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-12-04T11:11:09.6344669Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-12-04T11:11:09.6344744Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-12-04T11:11:09.6344820Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-12-04T11:11:09.6344901Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-12-04T11:11:09.6344976Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-12-04T11:11:09.6345053Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-12-04T11:11:09.6345133Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-12-04T11:11:09.6345204Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-12-04T11:11:09.6345284Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-12-04T11:11:09.6345360Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-12-04T11:11:09.6345436Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-12-04T11:11:09.6345517Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-12-04T11:11:09.6345593Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-12-04T11:11:09.6345668Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-12-04T11:11:09.6345749Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-12-04T11:11:09.6345824Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-12-04T11:11:09.6345898Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-12-04T11:11:09.6345978Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-12-04T11:11:09.6346053Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-12-04T11:11:09.6346128Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 
2025-12-04T11:11:09.6346236Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-12-04T11:11:09.6346312Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-12-04T11:11:09.6346390Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-12-04T11:11:09.6346489Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-12-04T11:11:09.6346564Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-12-04T11:11:09.6346639Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-12-04T11:11:09.6346718Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-12-04T11:11:09.6346794Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-12-04T11:11:09.6346869Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-12-04T11:11:09.6346949Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-12-04T11:11:09.6347024Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-12-04T11:11:09.6347104Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-12-04T11:11:09.6347180Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-12-04T11:11:09.6347255Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-12-04T11:11:09.6347335Z * [new branch] gh/soulitzer/380/base -> origin/gh/soulitzer/380/base 2025-12-04T11:11:09.6347410Z * [new branch] gh/soulitzer/380/head -> origin/gh/soulitzer/380/head 2025-12-04T11:11:09.6347486Z * [new branch] gh/soulitzer/380/orig -> origin/gh/soulitzer/380/orig 2025-12-04T11:11:09.6347565Z * [new branch] gh/soulitzer/385/base -> origin/gh/soulitzer/385/base 2025-12-04T11:11:09.6347642Z * [new branch] gh/soulitzer/385/head -> origin/gh/soulitzer/385/head 2025-12-04T11:11:09.6347717Z * [new branch] gh/soulitzer/385/orig -> origin/gh/soulitzer/385/orig 2025-12-04T11:11:09.6347798Z * [new branch] gh/soulitzer/386/base -> origin/gh/soulitzer/386/base 2025-12-04T11:11:09.6347872Z * [new branch] gh/soulitzer/386/head -> origin/gh/soulitzer/386/head 2025-12-04T11:11:09.6347946Z * [new branch] gh/soulitzer/386/orig -> origin/gh/soulitzer/386/orig 2025-12-04T11:11:09.6348025Z * [new branch] gh/soulitzer/387/base -> origin/gh/soulitzer/387/base 2025-12-04T11:11:09.6348100Z * [new branch] gh/soulitzer/387/head -> origin/gh/soulitzer/387/head 2025-12-04T11:11:09.6348213Z * [new branch] gh/soulitzer/387/orig -> origin/gh/soulitzer/387/orig 2025-12-04T11:11:09.6348293Z * [new branch] gh/soulitzer/388/base -> origin/gh/soulitzer/388/base 2025-12-04T11:11:09.6348369Z * [new branch] gh/soulitzer/388/head -> origin/gh/soulitzer/388/head 2025-12-04T11:11:09.6348445Z * [new branch] gh/soulitzer/388/orig -> origin/gh/soulitzer/388/orig 2025-12-04T11:11:09.6348525Z * [new branch] gh/soulitzer/389/base -> origin/gh/soulitzer/389/base 2025-12-04T11:11:09.6348600Z * [new branch] gh/soulitzer/389/head -> origin/gh/soulitzer/389/head 2025-12-04T11:11:09.6348677Z * [new branch] gh/soulitzer/389/orig -> origin/gh/soulitzer/389/orig 2025-12-04T11:11:09.6348752Z * [new branch] gh/soulitzer/390/base -> origin/gh/soulitzer/390/base 2025-12-04T11:11:09.6348826Z * [new branch] gh/soulitzer/390/head -> origin/gh/soulitzer/390/head 2025-12-04T11:11:09.6348906Z * [new branch] gh/soulitzer/390/orig -> origin/gh/soulitzer/390/orig 2025-12-04T11:11:09.6348981Z * [new branch] gh/soulitzer/391/base -> origin/gh/soulitzer/391/base 
2025-12-04T11:11:09.6349096Z * [new branch] gh/soulitzer/391/head -> origin/gh/soulitzer/391/head 2025-12-04T11:11:09.6349176Z * [new branch] gh/soulitzer/391/orig -> origin/gh/soulitzer/391/orig 2025-12-04T11:11:09.6349281Z * [new branch] gh/soulitzer/392/base -> origin/gh/soulitzer/392/base 2025-12-04T11:11:09.6349356Z * [new branch] gh/soulitzer/392/head -> origin/gh/soulitzer/392/head 2025-12-04T11:11:09.6349433Z * [new branch] gh/soulitzer/392/orig -> origin/gh/soulitzer/392/orig 2025-12-04T11:11:09.6349508Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-12-04T11:11:09.6349583Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-12-04T11:11:09.6349660Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-12-04T11:11:09.6349736Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-12-04T11:11:09.6349809Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-12-04T11:11:09.6349887Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-12-04T11:11:09.6349962Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-12-04T11:11:09.6350035Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-12-04T11:11:09.6350113Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-12-04T11:11:09.6350187Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-12-04T11:11:09.6350266Z * [new branch] gh/swolchok/839/base -> origin/gh/swolchok/839/base 2025-12-04T11:11:09.6350338Z * [new branch] gh/swolchok/839/head -> origin/gh/swolchok/839/head 2025-12-04T11:11:09.6350413Z * [new branch] gh/swolchok/839/orig -> origin/gh/swolchok/839/orig 2025-12-04T11:11:09.6350491Z * [new branch] gh/swolchok/841/base -> origin/gh/swolchok/841/base 2025-12-04T11:11:09.6350564Z * [new branch] gh/swolchok/841/head -> origin/gh/swolchok/841/head 2025-12-04T11:11:09.6350641Z * [new branch] gh/swolchok/841/orig -> origin/gh/swolchok/841/orig 2025-12-04T11:11:09.6350719Z * [new branch] gh/swolchok/842/base -> origin/gh/swolchok/842/base 2025-12-04T11:11:09.6350792Z * [new branch] gh/swolchok/842/head -> origin/gh/swolchok/842/head 2025-12-04T11:11:09.6350865Z * [new branch] gh/swolchok/842/orig -> origin/gh/swolchok/842/orig 2025-12-04T11:11:09.6350941Z * [new branch] gh/swolchok/845/base -> origin/gh/swolchok/845/base 2025-12-04T11:11:09.6351014Z * [new branch] gh/swolchok/845/head -> origin/gh/swolchok/845/head 2025-12-04T11:11:09.6351089Z * [new branch] gh/swolchok/845/orig -> origin/gh/swolchok/845/orig 2025-12-04T11:11:09.6351165Z * [new branch] gh/swolchok/848/base -> origin/gh/swolchok/848/base 2025-12-04T11:11:09.6351239Z * [new branch] gh/swolchok/848/head -> origin/gh/swolchok/848/head 2025-12-04T11:11:09.6351314Z * [new branch] gh/swolchok/848/orig -> origin/gh/swolchok/848/orig 2025-12-04T11:11:09.6351392Z * [new branch] gh/swolchok/856/base -> origin/gh/swolchok/856/base 2025-12-04T11:11:09.6351465Z * [new branch] gh/swolchok/856/head -> origin/gh/swolchok/856/head 2025-12-04T11:11:09.6351538Z * [new branch] gh/swolchok/856/orig -> origin/gh/swolchok/856/orig 2025-12-04T11:11:09.6351615Z * [new branch] gh/swolchok/860/base -> origin/gh/swolchok/860/base 2025-12-04T11:11:09.6351687Z * [new branch] gh/swolchok/860/head -> origin/gh/swolchok/860/head 2025-12-04T11:11:09.6351786Z * [new branch] gh/swolchok/860/orig -> origin/gh/swolchok/860/orig 2025-12-04T11:11:09.6351860Z * [new branch] gh/swolchok/861/base -> 
origin/gh/swolchok/861/base 2025-12-04T11:11:09.6351932Z * [new branch] gh/swolchok/861/head -> origin/gh/swolchok/861/head 2025-12-04T11:11:09.6352031Z * [new branch] gh/swolchok/861/orig -> origin/gh/swolchok/861/orig 2025-12-04T11:11:09.6352104Z * [new branch] gh/swolchok/862/base -> origin/gh/swolchok/862/base 2025-12-04T11:11:09.6352177Z * [new branch] gh/swolchok/862/head -> origin/gh/swolchok/862/head 2025-12-04T11:11:09.6352251Z * [new branch] gh/swolchok/862/orig -> origin/gh/swolchok/862/orig 2025-12-04T11:11:09.6352323Z * [new branch] gh/swolchok/863/base -> origin/gh/swolchok/863/base 2025-12-04T11:11:09.6352394Z * [new branch] gh/swolchok/863/head -> origin/gh/swolchok/863/head 2025-12-04T11:11:09.6352469Z * [new branch] gh/swolchok/863/orig -> origin/gh/swolchok/863/orig 2025-12-04T11:11:09.6352541Z * [new branch] gh/swolchok/864/base -> origin/gh/swolchok/864/base 2025-12-04T11:11:09.6352614Z * [new branch] gh/swolchok/864/head -> origin/gh/swolchok/864/head 2025-12-04T11:11:09.6352689Z * [new branch] gh/swolchok/864/orig -> origin/gh/swolchok/864/orig 2025-12-04T11:11:09.6352760Z * [new branch] gh/swolchok/865/base -> origin/gh/swolchok/865/base 2025-12-04T11:11:09.6352832Z * [new branch] gh/swolchok/865/head -> origin/gh/swolchok/865/head 2025-12-04T11:11:09.6352908Z * [new branch] gh/swolchok/865/orig -> origin/gh/swolchok/865/orig 2025-12-04T11:11:09.6352980Z * [new branch] gh/swolchok/866/base -> origin/gh/swolchok/866/base 2025-12-04T11:11:09.6353052Z * [new branch] gh/swolchok/866/head -> origin/gh/swolchok/866/head 2025-12-04T11:11:09.6353127Z * [new branch] gh/swolchok/866/orig -> origin/gh/swolchok/866/orig 2025-12-04T11:11:09.6353201Z * [new branch] gh/swolchok/867/base -> origin/gh/swolchok/867/base 2025-12-04T11:11:09.6353276Z * [new branch] gh/swolchok/867/head -> origin/gh/swolchok/867/head 2025-12-04T11:11:09.6353350Z * [new branch] gh/swolchok/867/orig -> origin/gh/swolchok/867/orig 2025-12-04T11:11:09.6353423Z * [new branch] gh/swolchok/868/base -> origin/gh/swolchok/868/base 2025-12-04T11:11:09.6353497Z * [new branch] gh/swolchok/868/head -> origin/gh/swolchok/868/head 2025-12-04T11:11:09.6353569Z * [new branch] gh/swolchok/868/orig -> origin/gh/swolchok/868/orig 2025-12-04T11:11:09.6353642Z * [new branch] gh/swolchok/869/base -> origin/gh/swolchok/869/base 2025-12-04T11:11:09.6353716Z * [new branch] gh/swolchok/869/head -> origin/gh/swolchok/869/head 2025-12-04T11:11:09.6353788Z * [new branch] gh/swolchok/869/orig -> origin/gh/swolchok/869/orig 2025-12-04T11:11:09.6353859Z * [new branch] gh/swolchok/870/base -> origin/gh/swolchok/870/base 2025-12-04T11:11:09.6353936Z * [new branch] gh/swolchok/870/head -> origin/gh/swolchok/870/head 2025-12-04T11:11:09.6354007Z * [new branch] gh/swolchok/870/orig -> origin/gh/swolchok/870/orig 2025-12-04T11:11:09.6354079Z * [new branch] gh/swolchok/871/base -> origin/gh/swolchok/871/base 2025-12-04T11:11:09.6354154Z * [new branch] gh/swolchok/871/head -> origin/gh/swolchok/871/head 2025-12-04T11:11:09.6354226Z * [new branch] gh/swolchok/871/orig -> origin/gh/swolchok/871/orig 2025-12-04T11:11:09.6354300Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-12-04T11:11:09.6354380Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-12-04T11:11:09.6354473Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-12-04T11:11:09.6354547Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-12-04T11:11:09.6354645Z * [new branch] gh/tianyu-l/2/head -> 
origin/gh/tianyu-l/2/head 2025-12-04T11:11:09.6354715Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-12-04T11:11:09.6354784Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-12-04T11:11:09.6354856Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-12-04T11:11:09.6354925Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-12-04T11:11:09.6354996Z * [new branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-12-04T11:11:09.6355065Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-12-04T11:11:09.6355158Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-12-04T11:11:09.6355250Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-12-04T11:11:09.6355339Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-12-04T11:11:09.6355424Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-12-04T11:11:09.6355513Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-12-04T11:11:09.6355598Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-12-04T11:11:09.6355684Z * [new branch] gh/tugsbayasgalan/17/base -> origin/gh/tugsbayasgalan/17/base 2025-12-04T11:11:09.6355772Z * [new branch] gh/tugsbayasgalan/17/head -> origin/gh/tugsbayasgalan/17/head 2025-12-04T11:11:09.6355858Z * [new branch] gh/tugsbayasgalan/17/orig -> origin/gh/tugsbayasgalan/17/orig 2025-12-04T11:11:09.6355945Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-12-04T11:11:09.6356031Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-12-04T11:11:09.6356117Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-12-04T11:11:09.6356203Z * [new branch] gh/tugsbayasgalan/28/base -> origin/gh/tugsbayasgalan/28/base 2025-12-04T11:11:09.6356292Z * [new branch] gh/tugsbayasgalan/28/head -> origin/gh/tugsbayasgalan/28/head 2025-12-04T11:11:09.6356377Z * [new branch] gh/tugsbayasgalan/28/orig -> origin/gh/tugsbayasgalan/28/orig 2025-12-04T11:11:09.6356462Z * [new branch] gh/tugsbayasgalan/32/base -> origin/gh/tugsbayasgalan/32/base 2025-12-04T11:11:09.6356550Z * [new branch] gh/tugsbayasgalan/32/head -> origin/gh/tugsbayasgalan/32/head 2025-12-04T11:11:09.6356637Z * [new branch] gh/tugsbayasgalan/32/orig -> origin/gh/tugsbayasgalan/32/orig 2025-12-04T11:11:09.6356726Z * [new branch] gh/tugsbayasgalan/35/base -> origin/gh/tugsbayasgalan/35/base 2025-12-04T11:11:09.6356816Z * [new branch] gh/tugsbayasgalan/35/head -> origin/gh/tugsbayasgalan/35/head 2025-12-04T11:11:09.6356903Z * [new branch] gh/tugsbayasgalan/35/orig -> origin/gh/tugsbayasgalan/35/orig 2025-12-04T11:11:09.6356993Z * [new branch] gh/tugsbayasgalan/36/base -> origin/gh/tugsbayasgalan/36/base 2025-12-04T11:11:09.6357077Z * [new branch] gh/tugsbayasgalan/36/head -> origin/gh/tugsbayasgalan/36/head 2025-12-04T11:11:09.6357162Z * [new branch] gh/tugsbayasgalan/36/orig -> origin/gh/tugsbayasgalan/36/orig 2025-12-04T11:11:09.6357250Z * [new branch] gh/tugsbayasgalan/37/base -> origin/gh/tugsbayasgalan/37/base 2025-12-04T11:11:09.6357358Z * [new branch] gh/tugsbayasgalan/37/head -> origin/gh/tugsbayasgalan/37/head 2025-12-04T11:11:09.6357444Z * [new branch] gh/tugsbayasgalan/37/orig -> origin/gh/tugsbayasgalan/37/orig 2025-12-04T11:11:09.6357554Z * [new branch] gh/tugsbayasgalan/43/base -> origin/gh/tugsbayasgalan/43/base 
2025-12-04T11:11:09.6357639Z * [new branch] gh/tugsbayasgalan/43/head -> origin/gh/tugsbayasgalan/43/head 2025-12-04T11:11:09.6357723Z * [new branch] gh/tugsbayasgalan/43/orig -> origin/gh/tugsbayasgalan/43/orig 2025-12-04T11:11:09.6357812Z * [new branch] gh/tugsbayasgalan/48/base -> origin/gh/tugsbayasgalan/48/base 2025-12-04T11:11:09.6357896Z * [new branch] gh/tugsbayasgalan/48/head -> origin/gh/tugsbayasgalan/48/head 2025-12-04T11:11:09.6357981Z * [new branch] gh/tugsbayasgalan/48/orig -> origin/gh/tugsbayasgalan/48/orig 2025-12-04T11:11:09.6358074Z * [new branch] gh/tugsbayasgalan/51/base -> origin/gh/tugsbayasgalan/51/base 2025-12-04T11:11:09.6358200Z * [new branch] gh/tugsbayasgalan/51/head -> origin/gh/tugsbayasgalan/51/head 2025-12-04T11:11:09.6358285Z * [new branch] gh/tugsbayasgalan/51/orig -> origin/gh/tugsbayasgalan/51/orig 2025-12-04T11:11:09.6358375Z * [new branch] gh/tugsbayasgalan/52/base -> origin/gh/tugsbayasgalan/52/base 2025-12-04T11:11:09.6358461Z * [new branch] gh/tugsbayasgalan/52/head -> origin/gh/tugsbayasgalan/52/head 2025-12-04T11:11:09.6358551Z * [new branch] gh/tugsbayasgalan/52/orig -> origin/gh/tugsbayasgalan/52/orig 2025-12-04T11:11:09.6358636Z * [new branch] gh/tugsbayasgalan/53/base -> origin/gh/tugsbayasgalan/53/base 2025-12-04T11:11:09.6358721Z * [new branch] gh/tugsbayasgalan/53/head -> origin/gh/tugsbayasgalan/53/head 2025-12-04T11:11:09.6358809Z * [new branch] gh/tugsbayasgalan/53/orig -> origin/gh/tugsbayasgalan/53/orig 2025-12-04T11:11:09.6358895Z * [new branch] gh/tugsbayasgalan/55/base -> origin/gh/tugsbayasgalan/55/base 2025-12-04T11:11:09.6358981Z * [new branch] gh/tugsbayasgalan/55/head -> origin/gh/tugsbayasgalan/55/head 2025-12-04T11:11:09.6359070Z * [new branch] gh/tugsbayasgalan/55/orig -> origin/gh/tugsbayasgalan/55/orig 2025-12-04T11:11:09.6359155Z * [new branch] gh/tugsbayasgalan/59/base -> origin/gh/tugsbayasgalan/59/base 2025-12-04T11:11:09.6359240Z * [new branch] gh/tugsbayasgalan/59/head -> origin/gh/tugsbayasgalan/59/head 2025-12-04T11:11:09.6359325Z * [new branch] gh/tugsbayasgalan/59/orig -> origin/gh/tugsbayasgalan/59/orig 2025-12-04T11:11:09.6359409Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-12-04T11:11:09.6359492Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-12-04T11:11:09.6359579Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-12-04T11:11:09.6359664Z * [new branch] gh/tugsbayasgalan/60/base -> origin/gh/tugsbayasgalan/60/base 2025-12-04T11:11:09.6359748Z * [new branch] gh/tugsbayasgalan/60/head -> origin/gh/tugsbayasgalan/60/head 2025-12-04T11:11:09.6359840Z * [new branch] gh/tugsbayasgalan/60/orig -> origin/gh/tugsbayasgalan/60/orig 2025-12-04T11:11:09.6359925Z * [new branch] gh/tugsbayasgalan/61/base -> origin/gh/tugsbayasgalan/61/base 2025-12-04T11:11:09.6360014Z * [new branch] gh/tugsbayasgalan/61/head -> origin/gh/tugsbayasgalan/61/head 2025-12-04T11:11:09.6360099Z * [new branch] gh/tugsbayasgalan/61/orig -> origin/gh/tugsbayasgalan/61/orig 2025-12-04T11:11:09.6360183Z * [new branch] gh/tugsbayasgalan/63/base -> origin/gh/tugsbayasgalan/63/base 2025-12-04T11:11:09.6360271Z * [new branch] gh/tugsbayasgalan/63/head -> origin/gh/tugsbayasgalan/63/head 2025-12-04T11:11:09.6360390Z * [new branch] gh/tugsbayasgalan/63/orig -> origin/gh/tugsbayasgalan/63/orig 2025-12-04T11:11:09.6360476Z * [new branch] gh/tugsbayasgalan/67/base -> origin/gh/tugsbayasgalan/67/base 2025-12-04T11:11:09.6360591Z * [new branch] 
gh/tugsbayasgalan/67/head -> origin/gh/tugsbayasgalan/67/head 2025-12-04T11:11:09.6360676Z * [new branch] gh/tugsbayasgalan/67/orig -> origin/gh/tugsbayasgalan/67/orig 2025-12-04T11:11:09.6360761Z * [new branch] gh/tugsbayasgalan/68/base -> origin/gh/tugsbayasgalan/68/base 2025-12-04T11:11:09.6360849Z * [new branch] gh/tugsbayasgalan/68/head -> origin/gh/tugsbayasgalan/68/head 2025-12-04T11:11:09.6360934Z * [new branch] gh/tugsbayasgalan/68/orig -> origin/gh/tugsbayasgalan/68/orig 2025-12-04T11:11:09.6361018Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-12-04T11:11:09.6361108Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-12-04T11:11:09.6361191Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-12-04T11:11:09.6361277Z * [new branch] gh/tugsbayasgalan/70/base -> origin/gh/tugsbayasgalan/70/base 2025-12-04T11:11:09.6361368Z * [new branch] gh/tugsbayasgalan/70/head -> origin/gh/tugsbayasgalan/70/head 2025-12-04T11:11:09.6361455Z * [new branch] gh/tugsbayasgalan/70/orig -> origin/gh/tugsbayasgalan/70/orig 2025-12-04T11:11:09.6361541Z * [new branch] gh/tugsbayasgalan/71/base -> origin/gh/tugsbayasgalan/71/base 2025-12-04T11:11:09.6361633Z * [new branch] gh/tugsbayasgalan/71/head -> origin/gh/tugsbayasgalan/71/head 2025-12-04T11:11:09.6361718Z * [new branch] gh/tugsbayasgalan/71/orig -> origin/gh/tugsbayasgalan/71/orig 2025-12-04T11:11:09.6361806Z * [new branch] gh/tugsbayasgalan/72/base -> origin/gh/tugsbayasgalan/72/base 2025-12-04T11:11:09.6361892Z * [new branch] gh/tugsbayasgalan/72/head -> origin/gh/tugsbayasgalan/72/head 2025-12-04T11:11:09.6361977Z * [new branch] gh/tugsbayasgalan/72/orig -> origin/gh/tugsbayasgalan/72/orig 2025-12-04T11:11:09.6362069Z * [new branch] gh/tugsbayasgalan/73/base -> origin/gh/tugsbayasgalan/73/base 2025-12-04T11:11:09.6362153Z * [new branch] gh/tugsbayasgalan/73/head -> origin/gh/tugsbayasgalan/73/head 2025-12-04T11:11:09.6362238Z * [new branch] gh/tugsbayasgalan/73/orig -> origin/gh/tugsbayasgalan/73/orig 2025-12-04T11:11:09.6362327Z * [new branch] gh/tugsbayasgalan/74/base -> origin/gh/tugsbayasgalan/74/base 2025-12-04T11:11:09.6362411Z * [new branch] gh/tugsbayasgalan/74/head -> origin/gh/tugsbayasgalan/74/head 2025-12-04T11:11:09.6362495Z * [new branch] gh/tugsbayasgalan/74/orig -> origin/gh/tugsbayasgalan/74/orig 2025-12-04T11:11:09.6362584Z * [new branch] gh/tugsbayasgalan/75/base -> origin/gh/tugsbayasgalan/75/base 2025-12-04T11:11:09.6362668Z * [new branch] gh/tugsbayasgalan/75/head -> origin/gh/tugsbayasgalan/75/head 2025-12-04T11:11:09.6362754Z * [new branch] gh/tugsbayasgalan/75/orig -> origin/gh/tugsbayasgalan/75/orig 2025-12-04T11:11:09.6362843Z * [new branch] gh/tugsbayasgalan/76/base -> origin/gh/tugsbayasgalan/76/base 2025-12-04T11:11:09.6362930Z * [new branch] gh/tugsbayasgalan/76/head -> origin/gh/tugsbayasgalan/76/head 2025-12-04T11:11:09.6363015Z * [new branch] gh/tugsbayasgalan/76/orig -> origin/gh/tugsbayasgalan/76/orig 2025-12-04T11:11:09.6363104Z * [new branch] gh/tugsbayasgalan/77/base -> origin/gh/tugsbayasgalan/77/base 2025-12-04T11:11:09.6363189Z * [new branch] gh/tugsbayasgalan/77/head -> origin/gh/tugsbayasgalan/77/head 2025-12-04T11:11:09.6363301Z * [new branch] gh/tugsbayasgalan/77/orig -> origin/gh/tugsbayasgalan/77/orig 2025-12-04T11:11:09.6363386Z * [new branch] gh/tugsbayasgalan/78/base -> origin/gh/tugsbayasgalan/78/base 2025-12-04T11:11:09.6363471Z * [new branch] gh/tugsbayasgalan/78/head -> origin/gh/tugsbayasgalan/78/head 
2025-12-04T11:11:09.6363584Z * [new branch] gh/tugsbayasgalan/78/orig -> origin/gh/tugsbayasgalan/78/orig 2025-12-04T11:11:09.6363669Z * [new branch] gh/tugsbayasgalan/79/base -> origin/gh/tugsbayasgalan/79/base 2025-12-04T11:11:09.6363755Z * [new branch] gh/tugsbayasgalan/79/head -> origin/gh/tugsbayasgalan/79/head 2025-12-04T11:11:09.6363844Z * [new branch] gh/tugsbayasgalan/79/orig -> origin/gh/tugsbayasgalan/79/orig 2025-12-04T11:11:09.6363927Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-12-04T11:11:09.6364011Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-12-04T11:11:09.6364101Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-12-04T11:11:09.6364186Z * [new branch] gh/tugsbayasgalan/80/base -> origin/gh/tugsbayasgalan/80/base 2025-12-04T11:11:09.6364274Z * [new branch] gh/tugsbayasgalan/80/head -> origin/gh/tugsbayasgalan/80/head 2025-12-04T11:11:09.6364365Z * [new branch] gh/tugsbayasgalan/80/orig -> origin/gh/tugsbayasgalan/80/orig 2025-12-04T11:11:09.6364450Z * [new branch] gh/tugsbayasgalan/81/base -> origin/gh/tugsbayasgalan/81/base 2025-12-04T11:11:09.6364535Z * [new branch] gh/tugsbayasgalan/81/head -> origin/gh/tugsbayasgalan/81/head 2025-12-04T11:11:09.6364624Z * [new branch] gh/tugsbayasgalan/81/orig -> origin/gh/tugsbayasgalan/81/orig 2025-12-04T11:11:09.6364708Z * [new branch] gh/tugsbayasgalan/82/base -> origin/gh/tugsbayasgalan/82/base 2025-12-04T11:11:09.6364794Z * [new branch] gh/tugsbayasgalan/82/head -> origin/gh/tugsbayasgalan/82/head 2025-12-04T11:11:09.6364883Z * [new branch] gh/tugsbayasgalan/82/orig -> origin/gh/tugsbayasgalan/82/orig 2025-12-04T11:11:09.6364968Z * [new branch] gh/tugsbayasgalan/83/base -> origin/gh/tugsbayasgalan/83/base 2025-12-04T11:11:09.6365057Z * [new branch] gh/tugsbayasgalan/83/head -> origin/gh/tugsbayasgalan/83/head 2025-12-04T11:11:09.6365142Z * [new branch] gh/tugsbayasgalan/83/orig -> origin/gh/tugsbayasgalan/83/orig 2025-12-04T11:11:09.6365226Z * [new branch] gh/tugsbayasgalan/84/base -> origin/gh/tugsbayasgalan/84/base 2025-12-04T11:11:09.6365316Z * [new branch] gh/tugsbayasgalan/84/head -> origin/gh/tugsbayasgalan/84/head 2025-12-04T11:11:09.6365400Z * [new branch] gh/tugsbayasgalan/84/orig -> origin/gh/tugsbayasgalan/84/orig 2025-12-04T11:11:09.6365485Z * [new branch] gh/tugsbayasgalan/85/base -> origin/gh/tugsbayasgalan/85/base 2025-12-04T11:11:09.6365576Z * [new branch] gh/tugsbayasgalan/85/head -> origin/gh/tugsbayasgalan/85/head 2025-12-04T11:11:09.6365660Z * [new branch] gh/tugsbayasgalan/85/orig -> origin/gh/tugsbayasgalan/85/orig 2025-12-04T11:11:09.6365745Z * [new branch] gh/tugsbayasgalan/86/base -> origin/gh/tugsbayasgalan/86/base 2025-12-04T11:11:09.6365833Z * [new branch] gh/tugsbayasgalan/86/head -> origin/gh/tugsbayasgalan/86/head 2025-12-04T11:11:09.6365917Z * [new branch] gh/tugsbayasgalan/86/orig -> origin/gh/tugsbayasgalan/86/orig 2025-12-04T11:11:09.6366001Z * [new branch] gh/tugsbayasgalan/87/base -> origin/gh/tugsbayasgalan/87/base 2025-12-04T11:11:09.6366087Z * [new branch] gh/tugsbayasgalan/87/head -> origin/gh/tugsbayasgalan/87/head 2025-12-04T11:11:09.6366172Z * [new branch] gh/tugsbayasgalan/87/orig -> origin/gh/tugsbayasgalan/87/orig 2025-12-04T11:11:09.6366276Z * [new branch] gh/tugsbayasgalan/88/base -> origin/gh/tugsbayasgalan/88/base 2025-12-04T11:11:09.6366366Z * [new branch] gh/tugsbayasgalan/88/head -> origin/gh/tugsbayasgalan/88/head 2025-12-04T11:11:09.6366473Z * [new branch] 
gh/tugsbayasgalan/88/orig -> origin/gh/tugsbayasgalan/88/orig 2025-12-04T11:11:09.6366561Z * [new branch] gh/tugsbayasgalan/89/base -> origin/gh/tugsbayasgalan/89/base 2025-12-04T11:11:09.6366646Z * [new branch] gh/tugsbayasgalan/89/head -> origin/gh/tugsbayasgalan/89/head 2025-12-04T11:11:09.6366731Z * [new branch] gh/tugsbayasgalan/89/orig -> origin/gh/tugsbayasgalan/89/orig 2025-12-04T11:11:09.6366819Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-12-04T11:11:09.6366902Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-12-04T11:11:09.6366987Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-12-04T11:11:09.6367078Z * [new branch] gh/tugsbayasgalan/90/base -> origin/gh/tugsbayasgalan/90/base 2025-12-04T11:11:09.6367164Z * [new branch] gh/tugsbayasgalan/90/head -> origin/gh/tugsbayasgalan/90/head 2025-12-04T11:11:09.6367253Z * [new branch] gh/tugsbayasgalan/90/orig -> origin/gh/tugsbayasgalan/90/orig 2025-12-04T11:11:09.6367344Z * [new branch] gh/tugsbayasgalan/91/base -> origin/gh/tugsbayasgalan/91/base 2025-12-04T11:11:09.6367429Z * [new branch] gh/tugsbayasgalan/91/head -> origin/gh/tugsbayasgalan/91/head 2025-12-04T11:11:09.6367514Z * [new branch] gh/tugsbayasgalan/91/orig -> origin/gh/tugsbayasgalan/91/orig 2025-12-04T11:11:09.6367603Z * [new branch] gh/tugsbayasgalan/92/base -> origin/gh/tugsbayasgalan/92/base 2025-12-04T11:11:09.6367687Z * [new branch] gh/tugsbayasgalan/92/head -> origin/gh/tugsbayasgalan/92/head 2025-12-04T11:11:09.6367773Z * [new branch] gh/tugsbayasgalan/92/orig -> origin/gh/tugsbayasgalan/92/orig 2025-12-04T11:11:09.6367863Z * [new branch] gh/tugsbayasgalan/93/base -> origin/gh/tugsbayasgalan/93/base 2025-12-04T11:11:09.6367948Z * [new branch] gh/tugsbayasgalan/93/head -> origin/gh/tugsbayasgalan/93/head 2025-12-04T11:11:09.6368033Z * [new branch] gh/tugsbayasgalan/93/orig -> origin/gh/tugsbayasgalan/93/orig 2025-12-04T11:11:09.6368108Z * [new branch] gh/v0i0/14/base -> origin/gh/v0i0/14/base 2025-12-04T11:11:09.6368207Z * [new branch] gh/v0i0/14/head -> origin/gh/v0i0/14/head 2025-12-04T11:11:09.6368279Z * [new branch] gh/v0i0/14/orig -> origin/gh/v0i0/14/orig 2025-12-04T11:11:09.6368346Z * [new branch] gh/v0i0/15/base -> origin/gh/v0i0/15/base 2025-12-04T11:11:09.6368410Z * [new branch] gh/v0i0/15/head -> origin/gh/v0i0/15/head 2025-12-04T11:11:09.6368481Z * [new branch] gh/v0i0/15/orig -> origin/gh/v0i0/15/orig 2025-12-04T11:11:09.6368548Z * [new branch] gh/v0i0/16/base -> origin/gh/v0i0/16/base 2025-12-04T11:11:09.6368614Z * [new branch] gh/v0i0/16/head -> origin/gh/v0i0/16/head 2025-12-04T11:11:09.6368684Z * [new branch] gh/v0i0/16/orig -> origin/gh/v0i0/16/orig 2025-12-04T11:11:09.6368751Z * [new branch] gh/v0i0/17/base -> origin/gh/v0i0/17/base 2025-12-04T11:11:09.6368817Z * [new branch] gh/v0i0/17/head -> origin/gh/v0i0/17/head 2025-12-04T11:11:09.6368884Z * [new branch] gh/v0i0/17/orig -> origin/gh/v0i0/17/orig 2025-12-04T11:11:09.6368949Z * [new branch] gh/v0i0/18/base -> origin/gh/v0i0/18/base 2025-12-04T11:11:09.6369013Z * [new branch] gh/v0i0/18/head -> origin/gh/v0i0/18/head 2025-12-04T11:11:09.6369106Z * [new branch] gh/v0i0/18/orig -> origin/gh/v0i0/18/orig 2025-12-04T11:11:09.6369172Z * [new branch] gh/v0i0/19/base -> origin/gh/v0i0/19/base 2025-12-04T11:11:09.6369238Z * [new branch] gh/v0i0/19/head -> origin/gh/v0i0/19/head 2025-12-04T11:11:09.6369342Z * [new branch] gh/v0i0/19/orig -> origin/gh/v0i0/19/orig 2025-12-04T11:11:09.6369426Z * [new 
branch] gh/vishal9-team/1/base -> origin/gh/vishal9-team/1/base 2025-12-04T11:11:09.6369504Z * [new branch] gh/vishal9-team/1/head -> origin/gh/vishal9-team/1/head 2025-12-04T11:11:09.6369583Z * [new branch] gh/vishal9-team/2/base -> origin/gh/vishal9-team/2/base 2025-12-04T11:11:09.6369659Z * [new branch] gh/vishal9-team/2/head -> origin/gh/vishal9-team/2/head 2025-12-04T11:11:09.6369735Z * [new branch] gh/vishal9-team/2/orig -> origin/gh/vishal9-team/2/orig 2025-12-04T11:11:09.6369814Z * [new branch] gh/vishal9-team/3/base -> origin/gh/vishal9-team/3/base 2025-12-04T11:11:09.6369890Z * [new branch] gh/vishal9-team/3/head -> origin/gh/vishal9-team/3/head 2025-12-04T11:11:09.6369967Z * [new branch] gh/vishal9-team/3/orig -> origin/gh/vishal9-team/3/orig 2025-12-04T11:11:09.6370044Z * [new branch] gh/vishal9-team/4/base -> origin/gh/vishal9-team/4/base 2025-12-04T11:11:09.6370120Z * [new branch] gh/vishal9-team/4/head -> origin/gh/vishal9-team/4/head 2025-12-04T11:11:09.6370197Z * [new branch] gh/vishal9-team/4/orig -> origin/gh/vishal9-team/4/orig 2025-12-04T11:11:09.6370264Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-12-04T11:11:09.6370332Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-12-04T11:11:09.6370404Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-12-04T11:11:09.6370483Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-12-04T11:11:09.6370561Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-12-04T11:11:09.6370638Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-12-04T11:11:09.6370710Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-12-04T11:11:09.6370783Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-12-04T11:11:09.6370856Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-12-04T11:11:09.6370927Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-12-04T11:11:09.6370999Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-12-04T11:11:09.6371074Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-12-04T11:11:09.6371147Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-12-04T11:11:09.6371219Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-12-04T11:11:09.6371296Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-12-04T11:11:09.6371367Z * [new branch] gh/wconstab/448/base -> origin/gh/wconstab/448/base 2025-12-04T11:11:09.6371438Z * [new branch] gh/wconstab/448/head -> origin/gh/wconstab/448/head 2025-12-04T11:11:09.6371514Z * [new branch] gh/wconstab/448/orig -> origin/gh/wconstab/448/orig 2025-12-04T11:11:09.6371587Z * [new branch] gh/wconstab/449/base -> origin/gh/wconstab/449/base 2025-12-04T11:11:09.6371662Z * [new branch] gh/wconstab/449/head -> origin/gh/wconstab/449/head 2025-12-04T11:11:09.6371734Z * [new branch] gh/wconstab/449/orig -> origin/gh/wconstab/449/orig 2025-12-04T11:11:09.6371829Z * [new branch] gh/wconstab/450/base -> origin/gh/wconstab/450/base 2025-12-04T11:11:09.6371903Z * [new branch] gh/wconstab/450/head -> origin/gh/wconstab/450/head 2025-12-04T11:11:09.6371997Z * [new branch] gh/wconstab/450/orig -> origin/gh/wconstab/450/orig 2025-12-04T11:11:09.6372068Z * [new branch] gh/wconstab/451/base -> origin/gh/wconstab/451/base 2025-12-04T11:11:09.6372144Z * [new branch] gh/wconstab/451/head -> origin/gh/wconstab/451/head 
2025-12-04T11:11:09.6372217Z * [new branch] gh/wconstab/451/orig -> origin/gh/wconstab/451/orig 2025-12-04T11:11:09.6372289Z * [new branch] gh/wconstab/452/base -> origin/gh/wconstab/452/base 2025-12-04T11:11:09.6372363Z * [new branch] gh/wconstab/452/head -> origin/gh/wconstab/452/head 2025-12-04T11:11:09.6372435Z * [new branch] gh/wconstab/452/orig -> origin/gh/wconstab/452/orig 2025-12-04T11:11:09.6372509Z * [new branch] gh/wconstab/453/base -> origin/gh/wconstab/453/base 2025-12-04T11:11:09.6372583Z * [new branch] gh/wconstab/453/head -> origin/gh/wconstab/453/head 2025-12-04T11:11:09.6372656Z * [new branch] gh/wconstab/453/orig -> origin/gh/wconstab/453/orig 2025-12-04T11:11:09.6372729Z * [new branch] gh/wconstab/454/base -> origin/gh/wconstab/454/base 2025-12-04T11:11:09.6372807Z * [new branch] gh/wconstab/454/head -> origin/gh/wconstab/454/head 2025-12-04T11:11:09.6372880Z * [new branch] gh/wconstab/454/orig -> origin/gh/wconstab/454/orig 2025-12-04T11:11:09.6372952Z * [new branch] gh/wconstab/455/base -> origin/gh/wconstab/455/base 2025-12-04T11:11:09.6373027Z * [new branch] gh/wconstab/455/head -> origin/gh/wconstab/455/head 2025-12-04T11:11:09.6373100Z * [new branch] gh/wconstab/455/orig -> origin/gh/wconstab/455/orig 2025-12-04T11:11:09.6373172Z * [new branch] gh/wconstab/456/base -> origin/gh/wconstab/456/base 2025-12-04T11:11:09.6373246Z * [new branch] gh/wconstab/456/head -> origin/gh/wconstab/456/head 2025-12-04T11:11:09.6373320Z * [new branch] gh/wconstab/456/orig -> origin/gh/wconstab/456/orig 2025-12-04T11:11:09.6373394Z * [new branch] gh/wconstab/457/base -> origin/gh/wconstab/457/base 2025-12-04T11:11:09.6373467Z * [new branch] gh/wconstab/457/head -> origin/gh/wconstab/457/head 2025-12-04T11:11:09.6373543Z * [new branch] gh/wconstab/457/orig -> origin/gh/wconstab/457/orig 2025-12-04T11:11:09.6373636Z * [new branch] gh/wconstab/458/base -> origin/gh/wconstab/458/base 2025-12-04T11:11:09.6373713Z * [new branch] gh/wconstab/458/head -> origin/gh/wconstab/458/head 2025-12-04T11:11:09.6373787Z * [new branch] gh/wconstab/458/orig -> origin/gh/wconstab/458/orig 2025-12-04T11:11:09.6373861Z * [new branch] gh/wconstab/459/base -> origin/gh/wconstab/459/base 2025-12-04T11:11:09.6373932Z * [new branch] gh/wconstab/459/head -> origin/gh/wconstab/459/head 2025-12-04T11:11:09.6374006Z * [new branch] gh/wconstab/459/orig -> origin/gh/wconstab/459/orig 2025-12-04T11:11:09.6374079Z * [new branch] gh/wconstab/460/base -> origin/gh/wconstab/460/base 2025-12-04T11:11:09.6374150Z * [new branch] gh/wconstab/460/head -> origin/gh/wconstab/460/head 2025-12-04T11:11:09.6374222Z * [new branch] gh/wconstab/460/orig -> origin/gh/wconstab/460/orig 2025-12-04T11:11:09.6374296Z * [new branch] gh/wconstab/461/base -> origin/gh/wconstab/461/base 2025-12-04T11:11:09.6374367Z * [new branch] gh/wconstab/461/head -> origin/gh/wconstab/461/head 2025-12-04T11:11:09.6374458Z * [new branch] gh/wconstab/461/orig -> origin/gh/wconstab/461/orig 2025-12-04T11:11:09.6374533Z * [new branch] gh/wconstab/462/base -> origin/gh/wconstab/462/base 2025-12-04T11:11:09.6374605Z * [new branch] gh/wconstab/462/head -> origin/gh/wconstab/462/head 2025-12-04T11:11:09.6374700Z * [new branch] gh/wconstab/462/orig -> origin/gh/wconstab/462/orig 2025-12-04T11:11:09.6374774Z * [new branch] gh/wconstab/463/base -> origin/gh/wconstab/463/base 2025-12-04T11:11:09.6374846Z * [new branch] gh/wconstab/463/head -> origin/gh/wconstab/463/head 2025-12-04T11:11:09.6374921Z * [new branch] gh/wconstab/463/orig -> origin/gh/wconstab/463/orig 
2025-12-04T11:11:09.6374994Z * [new branch] gh/wconstab/464/base -> origin/gh/wconstab/464/base 2025-12-04T11:11:09.6375067Z * [new branch] gh/wconstab/464/head -> origin/gh/wconstab/464/head 2025-12-04T11:11:09.6375145Z * [new branch] gh/wconstab/464/orig -> origin/gh/wconstab/464/orig 2025-12-04T11:11:09.6375216Z * [new branch] gh/wconstab/465/base -> origin/gh/wconstab/465/base 2025-12-04T11:11:09.6375288Z * [new branch] gh/wconstab/465/head -> origin/gh/wconstab/465/head 2025-12-04T11:11:09.6375370Z * [new branch] gh/wconstab/465/orig -> origin/gh/wconstab/465/orig 2025-12-04T11:11:09.6375441Z * [new branch] gh/wconstab/466/base -> origin/gh/wconstab/466/base 2025-12-04T11:11:09.6375513Z * [new branch] gh/wconstab/466/head -> origin/gh/wconstab/466/head 2025-12-04T11:11:09.6375588Z * [new branch] gh/wconstab/466/orig -> origin/gh/wconstab/466/orig 2025-12-04T11:11:09.6375659Z * [new branch] gh/wconstab/467/base -> origin/gh/wconstab/467/base 2025-12-04T11:11:09.6375731Z * [new branch] gh/wconstab/467/head -> origin/gh/wconstab/467/head 2025-12-04T11:11:09.6375806Z * [new branch] gh/wconstab/467/orig -> origin/gh/wconstab/467/orig 2025-12-04T11:11:09.6375879Z * [new branch] gh/wconstab/468/base -> origin/gh/wconstab/468/base 2025-12-04T11:11:09.6375950Z * [new branch] gh/wconstab/468/head -> origin/gh/wconstab/468/head 2025-12-04T11:11:09.6376028Z * [new branch] gh/wconstab/468/orig -> origin/gh/wconstab/468/orig 2025-12-04T11:11:09.6376101Z * [new branch] gh/weifengpy/39/base -> origin/gh/weifengpy/39/base 2025-12-04T11:11:09.6376174Z * [new branch] gh/weifengpy/39/head -> origin/gh/weifengpy/39/head 2025-12-04T11:11:09.6376249Z * [new branch] gh/weifengpy/39/orig -> origin/gh/weifengpy/39/orig 2025-12-04T11:11:09.6376321Z * [new branch] gh/weifengpy/40/base -> origin/gh/weifengpy/40/base 2025-12-04T11:11:09.6376396Z * [new branch] gh/weifengpy/40/head -> origin/gh/weifengpy/40/head 2025-12-04T11:11:09.6376470Z * [new branch] gh/weifengpy/40/orig -> origin/gh/weifengpy/40/orig 2025-12-04T11:11:09.6376543Z * [new branch] gh/weifengpy/41/base -> origin/gh/weifengpy/41/base 2025-12-04T11:11:09.6376618Z * [new branch] gh/weifengpy/41/head -> origin/gh/weifengpy/41/head 2025-12-04T11:11:09.6376692Z * [new branch] gh/weifengpy/41/orig -> origin/gh/weifengpy/41/orig 2025-12-04T11:11:09.6376777Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-12-04T11:11:09.6376862Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-12-04T11:11:09.6376943Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-12-04T11:11:09.6377022Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-12-04T11:11:09.6377104Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-12-04T11:11:09.6377205Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-12-04T11:11:09.6377284Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-12-04T11:11:09.6377385Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-12-04T11:11:09.6377464Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-12-04T11:11:09.6377542Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-12-04T11:11:09.6377623Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-12-04T11:11:09.6377703Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 
2025-12-04T11:11:09.6377782Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-12-04T11:11:09.6377863Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-12-04T11:11:09.6377942Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-12-04T11:11:09.6378023Z * [new branch] gh/williamwen42/296/base -> origin/gh/williamwen42/296/base 2025-12-04T11:11:09.6378104Z * [new branch] gh/williamwen42/296/head -> origin/gh/williamwen42/296/head 2025-12-04T11:11:09.6378215Z * [new branch] gh/williamwen42/296/orig -> origin/gh/williamwen42/296/orig 2025-12-04T11:11:09.6378296Z * [new branch] gh/williamwen42/297/base -> origin/gh/williamwen42/297/base 2025-12-04T11:11:09.6378377Z * [new branch] gh/williamwen42/297/head -> origin/gh/williamwen42/297/head 2025-12-04T11:11:09.6378456Z * [new branch] gh/williamwen42/297/orig -> origin/gh/williamwen42/297/orig 2025-12-04T11:11:09.6378538Z * [new branch] gh/williamwen42/306/base -> origin/gh/williamwen42/306/base 2025-12-04T11:11:09.6378618Z * [new branch] gh/williamwen42/306/head -> origin/gh/williamwen42/306/head 2025-12-04T11:11:09.6378697Z * [new branch] gh/williamwen42/306/orig -> origin/gh/williamwen42/306/orig 2025-12-04T11:11:09.6378779Z * [new branch] gh/williamwen42/309/base -> origin/gh/williamwen42/309/base 2025-12-04T11:11:09.6378857Z * [new branch] gh/williamwen42/309/head -> origin/gh/williamwen42/309/head 2025-12-04T11:11:09.6378936Z * [new branch] gh/williamwen42/309/orig -> origin/gh/williamwen42/309/orig 2025-12-04T11:11:09.6379017Z * [new branch] gh/williamwen42/310/base -> origin/gh/williamwen42/310/base 2025-12-04T11:11:09.6379095Z * [new branch] gh/williamwen42/310/head -> origin/gh/williamwen42/310/head 2025-12-04T11:11:09.6379174Z * [new branch] gh/williamwen42/310/orig -> origin/gh/williamwen42/310/orig 2025-12-04T11:11:09.6379256Z * [new branch] gh/williamwen42/311/base -> origin/gh/williamwen42/311/base 2025-12-04T11:11:09.6379335Z * [new branch] gh/williamwen42/311/head -> origin/gh/williamwen42/311/head 2025-12-04T11:11:09.6379415Z * [new branch] gh/williamwen42/311/orig -> origin/gh/williamwen42/311/orig 2025-12-04T11:11:09.6379496Z * [new branch] gh/williamwen42/319/base -> origin/gh/williamwen42/319/base 2025-12-04T11:11:09.6379574Z * [new branch] gh/williamwen42/319/head -> origin/gh/williamwen42/319/head 2025-12-04T11:11:09.6379654Z * [new branch] gh/williamwen42/319/orig -> origin/gh/williamwen42/319/orig 2025-12-04T11:11:09.6379733Z * [new branch] gh/williamwen42/325/base -> origin/gh/williamwen42/325/base 2025-12-04T11:11:09.6379813Z * [new branch] gh/williamwen42/325/head -> origin/gh/williamwen42/325/head 2025-12-04T11:11:09.6379896Z * [new branch] gh/williamwen42/325/orig -> origin/gh/williamwen42/325/orig 2025-12-04T11:11:09.6380008Z * [new branch] gh/williamwen42/326/base -> origin/gh/williamwen42/326/base 2025-12-04T11:11:09.6380087Z * [new branch] gh/williamwen42/326/head -> origin/gh/williamwen42/326/head 2025-12-04T11:11:09.6380203Z * [new branch] gh/williamwen42/326/orig -> origin/gh/williamwen42/326/orig 2025-12-04T11:11:09.6380281Z * [new branch] gh/williamwen42/327/base -> origin/gh/williamwen42/327/base 2025-12-04T11:11:09.6380363Z * [new branch] gh/williamwen42/327/head -> origin/gh/williamwen42/327/head 2025-12-04T11:11:09.6380444Z * [new branch] gh/williamwen42/327/orig -> origin/gh/williamwen42/327/orig 2025-12-04T11:11:09.6380523Z * [new branch] gh/williamwen42/328/base -> origin/gh/williamwen42/328/base 
2025-12-04T11:11:09.6380602Z * [new branch] gh/williamwen42/328/head -> origin/gh/williamwen42/328/head 2025-12-04T11:11:09.6380685Z * [new branch] gh/williamwen42/328/orig -> origin/gh/williamwen42/328/orig 2025-12-04T11:11:09.6380764Z * [new branch] gh/williamwen42/329/base -> origin/gh/williamwen42/329/base 2025-12-04T11:11:09.6380842Z * [new branch] gh/williamwen42/329/head -> origin/gh/williamwen42/329/head 2025-12-04T11:11:09.6380924Z * [new branch] gh/williamwen42/329/orig -> origin/gh/williamwen42/329/orig 2025-12-04T11:11:09.6381003Z * [new branch] gh/williamwen42/330/base -> origin/gh/williamwen42/330/base 2025-12-04T11:11:09.6381084Z * [new branch] gh/williamwen42/330/head -> origin/gh/williamwen42/330/head 2025-12-04T11:11:09.6381162Z * [new branch] gh/williamwen42/330/orig -> origin/gh/williamwen42/330/orig 2025-12-04T11:11:09.6381239Z * [new branch] gh/williamwen42/331/base -> origin/gh/williamwen42/331/base 2025-12-04T11:11:09.6381320Z * [new branch] gh/williamwen42/331/head -> origin/gh/williamwen42/331/head 2025-12-04T11:11:09.6381399Z * [new branch] gh/williamwen42/331/orig -> origin/gh/williamwen42/331/orig 2025-12-04T11:11:09.6381479Z * [new branch] gh/williamwen42/332/base -> origin/gh/williamwen42/332/base 2025-12-04T11:11:09.6381564Z * [new branch] gh/williamwen42/332/head -> origin/gh/williamwen42/332/head 2025-12-04T11:11:09.6381642Z * [new branch] gh/williamwen42/332/orig -> origin/gh/williamwen42/332/orig 2025-12-04T11:11:09.6381721Z * [new branch] gh/williamwen42/333/base -> origin/gh/williamwen42/333/base 2025-12-04T11:11:09.6381801Z * [new branch] gh/williamwen42/333/head -> origin/gh/williamwen42/333/head 2025-12-04T11:11:09.6381880Z * [new branch] gh/williamwen42/333/orig -> origin/gh/williamwen42/333/orig 2025-12-04T11:11:09.6381959Z * [new branch] gh/williamwen42/334/base -> origin/gh/williamwen42/334/base 2025-12-04T11:11:09.6382042Z * [new branch] gh/williamwen42/334/head -> origin/gh/williamwen42/334/head 2025-12-04T11:11:09.6382121Z * [new branch] gh/williamwen42/334/orig -> origin/gh/williamwen42/334/orig 2025-12-04T11:11:09.6382200Z * [new branch] gh/williamwen42/335/base -> origin/gh/williamwen42/335/base 2025-12-04T11:11:09.6382283Z * [new branch] gh/williamwen42/335/head -> origin/gh/williamwen42/335/head 2025-12-04T11:11:09.6382362Z * [new branch] gh/williamwen42/335/orig -> origin/gh/williamwen42/335/orig 2025-12-04T11:11:09.6382440Z * [new branch] gh/williamwen42/336/base -> origin/gh/williamwen42/336/base 2025-12-04T11:11:09.6382521Z * [new branch] gh/williamwen42/336/head -> origin/gh/williamwen42/336/head 2025-12-04T11:11:09.6382602Z * [new branch] gh/williamwen42/336/orig -> origin/gh/williamwen42/336/orig 2025-12-04T11:11:09.6382682Z * [new branch] gh/williamwen42/337/base -> origin/gh/williamwen42/337/base 2025-12-04T11:11:09.6382783Z * [new branch] gh/williamwen42/337/head -> origin/gh/williamwen42/337/head 2025-12-04T11:11:09.6382862Z * [new branch] gh/williamwen42/337/orig -> origin/gh/williamwen42/337/orig 2025-12-04T11:11:09.6382966Z * [new branch] gh/williamwen42/338/base -> origin/gh/williamwen42/338/base 2025-12-04T11:11:09.6383047Z * [new branch] gh/williamwen42/338/head -> origin/gh/williamwen42/338/head 2025-12-04T11:11:09.6383126Z * [new branch] gh/williamwen42/338/orig -> origin/gh/williamwen42/338/orig 2025-12-04T11:11:09.6383209Z * [new branch] gh/williamwen42/339/base -> origin/gh/williamwen42/339/base 2025-12-04T11:11:09.6383289Z * [new branch] gh/williamwen42/339/head -> origin/gh/williamwen42/339/head 
2025-12-04T11:11:09.6383369Z * [new branch] gh/williamwen42/339/orig -> origin/gh/williamwen42/339/orig 2025-12-04T11:11:09.6383451Z * [new branch] gh/williamwen42/340/base -> origin/gh/williamwen42/340/base 2025-12-04T11:11:09.6383532Z * [new branch] gh/williamwen42/340/head -> origin/gh/williamwen42/340/head 2025-12-04T11:11:09.6383613Z * [new branch] gh/williamwen42/340/orig -> origin/gh/williamwen42/340/orig 2025-12-04T11:11:09.6383696Z * [new branch] gh/williamwen42/341/base -> origin/gh/williamwen42/341/base 2025-12-04T11:11:09.6383775Z * [new branch] gh/williamwen42/341/head -> origin/gh/williamwen42/341/head 2025-12-04T11:11:09.6383853Z * [new branch] gh/williamwen42/341/orig -> origin/gh/williamwen42/341/orig 2025-12-04T11:11:09.6383934Z * [new branch] gh/williamwen42/342/base -> origin/gh/williamwen42/342/base 2025-12-04T11:11:09.6384015Z * [new branch] gh/williamwen42/342/head -> origin/gh/williamwen42/342/head 2025-12-04T11:11:09.6384101Z * [new branch] gh/williamwen42/342/orig -> origin/gh/williamwen42/342/orig 2025-12-04T11:11:09.6384182Z * [new branch] gh/williamwen42/343/base -> origin/gh/williamwen42/343/base 2025-12-04T11:11:09.6384264Z * [new branch] gh/williamwen42/343/head -> origin/gh/williamwen42/343/head 2025-12-04T11:11:09.6384349Z * [new branch] gh/williamwen42/343/orig -> origin/gh/williamwen42/343/orig 2025-12-04T11:11:09.6384430Z * [new branch] gh/williamwen42/344/base -> origin/gh/williamwen42/344/base 2025-12-04T11:11:09.6384514Z * [new branch] gh/williamwen42/344/head -> origin/gh/williamwen42/344/head 2025-12-04T11:11:09.6384599Z * [new branch] gh/williamwen42/344/orig -> origin/gh/williamwen42/344/orig 2025-12-04T11:11:09.6384680Z * [new branch] gh/williamwen42/345/base -> origin/gh/williamwen42/345/base 2025-12-04T11:11:09.6384759Z * [new branch] gh/williamwen42/345/head -> origin/gh/williamwen42/345/head 2025-12-04T11:11:09.6384841Z * [new branch] gh/williamwen42/345/orig -> origin/gh/williamwen42/345/orig 2025-12-04T11:11:09.6384924Z * [new branch] gh/williamwen42/346/base -> origin/gh/williamwen42/346/base 2025-12-04T11:11:09.6385005Z * [new branch] gh/williamwen42/346/head -> origin/gh/williamwen42/346/head 2025-12-04T11:11:09.6385091Z * [new branch] gh/williamwen42/346/orig -> origin/gh/williamwen42/346/orig 2025-12-04T11:11:09.6385173Z * [new branch] gh/williamwen42/347/base -> origin/gh/williamwen42/347/base 2025-12-04T11:11:09.6385254Z * [new branch] gh/williamwen42/347/head -> origin/gh/williamwen42/347/head 2025-12-04T11:11:09.6385336Z * [new branch] gh/williamwen42/347/orig -> origin/gh/williamwen42/347/orig 2025-12-04T11:11:09.6385415Z * [new branch] gh/williamwen42/348/base -> origin/gh/williamwen42/348/base 2025-12-04T11:11:09.6385494Z * [new branch] gh/williamwen42/348/head -> origin/gh/williamwen42/348/head 2025-12-04T11:11:09.6385601Z * [new branch] gh/williamwen42/348/orig -> origin/gh/williamwen42/348/orig 2025-12-04T11:11:09.6385684Z * [new branch] gh/williamwen42/349/base -> origin/gh/williamwen42/349/base 2025-12-04T11:11:09.6385768Z * [new branch] gh/williamwen42/349/head -> origin/gh/williamwen42/349/head 2025-12-04T11:11:09.6385875Z * [new branch] gh/williamwen42/349/orig -> origin/gh/williamwen42/349/orig 2025-12-04T11:11:09.6385954Z * [new branch] gh/williamwen42/350/base -> origin/gh/williamwen42/350/base 2025-12-04T11:11:09.6386040Z * [new branch] gh/williamwen42/350/head -> origin/gh/williamwen42/350/head 2025-12-04T11:11:09.6386125Z * [new branch] gh/williamwen42/350/orig -> origin/gh/williamwen42/350/orig 
2025-12-04T11:11:09.6386205Z * [new branch] gh/williamwen42/351/base -> origin/gh/williamwen42/351/base 2025-12-04T11:11:09.6386286Z * [new branch] gh/williamwen42/351/head -> origin/gh/williamwen42/351/head 2025-12-04T11:11:09.6386368Z * [new branch] gh/williamwen42/351/orig -> origin/gh/williamwen42/351/orig 2025-12-04T11:11:09.6386449Z * [new branch] gh/williamwen42/352/base -> origin/gh/williamwen42/352/base 2025-12-04T11:11:09.6386533Z * [new branch] gh/williamwen42/352/head -> origin/gh/williamwen42/352/head 2025-12-04T11:11:09.6386615Z * [new branch] gh/williamwen42/352/orig -> origin/gh/williamwen42/352/orig 2025-12-04T11:11:09.6386696Z * [new branch] gh/williamwen42/353/base -> origin/gh/williamwen42/353/base 2025-12-04T11:11:09.6386778Z * [new branch] gh/williamwen42/353/head -> origin/gh/williamwen42/353/head 2025-12-04T11:11:09.6386858Z * [new branch] gh/williamwen42/353/orig -> origin/gh/williamwen42/353/orig 2025-12-04T11:11:09.6386937Z * [new branch] gh/williamwen42/354/base -> origin/gh/williamwen42/354/base 2025-12-04T11:11:09.6387020Z * [new branch] gh/williamwen42/354/head -> origin/gh/williamwen42/354/head 2025-12-04T11:11:09.6387099Z * [new branch] gh/williamwen42/354/orig -> origin/gh/williamwen42/354/orig 2025-12-04T11:11:09.6387181Z * [new branch] gh/williamwen42/355/base -> origin/gh/williamwen42/355/base 2025-12-04T11:11:09.6387262Z * [new branch] gh/williamwen42/355/head -> origin/gh/williamwen42/355/head 2025-12-04T11:11:09.6387341Z * [new branch] gh/williamwen42/355/orig -> origin/gh/williamwen42/355/orig 2025-12-04T11:11:09.6387426Z * [new branch] gh/williamwen42/356/base -> origin/gh/williamwen42/356/base 2025-12-04T11:11:09.6387504Z * [new branch] gh/williamwen42/356/head -> origin/gh/williamwen42/356/head 2025-12-04T11:11:09.6387584Z * [new branch] gh/williamwen42/356/orig -> origin/gh/williamwen42/356/orig 2025-12-04T11:11:09.6387670Z * [new branch] gh/williamwen42/357/base -> origin/gh/williamwen42/357/base 2025-12-04T11:11:09.6387752Z * [new branch] gh/williamwen42/357/head -> origin/gh/williamwen42/357/head 2025-12-04T11:11:09.6387832Z * [new branch] gh/williamwen42/357/orig -> origin/gh/williamwen42/357/orig 2025-12-04T11:11:09.6387916Z * [new branch] gh/williamwen42/358/base -> origin/gh/williamwen42/358/base 2025-12-04T11:11:09.6387997Z * [new branch] gh/williamwen42/358/head -> origin/gh/williamwen42/358/head 2025-12-04T11:11:09.6388077Z * [new branch] gh/williamwen42/358/orig -> origin/gh/williamwen42/358/orig 2025-12-04T11:11:09.6388193Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-12-04T11:11:09.6388266Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-12-04T11:11:09.6388336Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-12-04T11:11:09.6388441Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-12-04T11:11:09.6388510Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-12-04T11:11:09.6388577Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-12-04T11:11:09.6388674Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-12-04T11:11:09.6388742Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-12-04T11:11:09.6388812Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-12-04T11:11:09.6388879Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-12-04T11:11:09.6388947Z * [new branch] gh/xmfan/301/base -> origin/gh/xmfan/301/base 2025-12-04T11:11:09.6389020Z * [new branch] gh/xmfan/301/head -> 
origin/gh/xmfan/301/head 2025-12-04T11:11:09.6389088Z * [new branch] gh/xmfan/301/orig -> origin/gh/xmfan/301/orig 2025-12-04T11:11:09.6389161Z * [new branch] gh/xmfan/304/base -> origin/gh/xmfan/304/base 2025-12-04T11:11:09.6389237Z * [new branch] gh/xmfan/304/head -> origin/gh/xmfan/304/head 2025-12-04T11:11:09.6389310Z * [new branch] gh/xmfan/304/orig -> origin/gh/xmfan/304/orig 2025-12-04T11:11:09.6389377Z * [new branch] gh/xmfan/309/base -> origin/gh/xmfan/309/base 2025-12-04T11:11:09.6389447Z * [new branch] gh/xmfan/309/head -> origin/gh/xmfan/309/head 2025-12-04T11:11:09.6389514Z * [new branch] gh/xmfan/309/orig -> origin/gh/xmfan/309/orig 2025-12-04T11:11:09.6389581Z * [new branch] gh/xmfan/310/base -> origin/gh/xmfan/310/base 2025-12-04T11:11:09.6389651Z * [new branch] gh/xmfan/310/head -> origin/gh/xmfan/310/head 2025-12-04T11:11:09.6389719Z * [new branch] gh/xmfan/310/orig -> origin/gh/xmfan/310/orig 2025-12-04T11:11:09.6389789Z * [new branch] gh/xmfan/311/base -> origin/gh/xmfan/311/base 2025-12-04T11:11:09.6389861Z * [new branch] gh/xmfan/311/head -> origin/gh/xmfan/311/head 2025-12-04T11:11:09.6389931Z * [new branch] gh/xmfan/311/orig -> origin/gh/xmfan/311/orig 2025-12-04T11:11:09.6389998Z * [new branch] gh/xmfan/312/base -> origin/gh/xmfan/312/base 2025-12-04T11:11:09.6390069Z * [new branch] gh/xmfan/312/head -> origin/gh/xmfan/312/head 2025-12-04T11:11:09.6390136Z * [new branch] gh/xmfan/312/orig -> origin/gh/xmfan/312/orig 2025-12-04T11:11:09.6390205Z * [new branch] gh/xmfan/313/base -> origin/gh/xmfan/313/base 2025-12-04T11:11:09.6390275Z * [new branch] gh/xmfan/313/head -> origin/gh/xmfan/313/head 2025-12-04T11:11:09.6390343Z * [new branch] gh/xmfan/313/orig -> origin/gh/xmfan/313/orig 2025-12-04T11:11:09.6390424Z * [new branch] gh/xuanzhang816/27/base -> origin/gh/xuanzhang816/27/base 2025-12-04T11:11:09.6390508Z * [new branch] gh/xuanzhang816/27/head -> origin/gh/xuanzhang816/27/head 2025-12-04T11:11:09.6390591Z * [new branch] gh/xuanzhang816/27/orig -> origin/gh/xuanzhang816/27/orig 2025-12-04T11:11:09.6390673Z * [new branch] gh/xuanzhang816/32/base -> origin/gh/xuanzhang816/32/base 2025-12-04T11:11:09.6390751Z * [new branch] gh/xuanzhang816/32/head -> origin/gh/xuanzhang816/32/head 2025-12-04T11:11:09.6390829Z * [new branch] gh/xuanzhang816/32/orig -> origin/gh/xuanzhang816/32/orig 2025-12-04T11:11:09.6390909Z * [new branch] gh/xuanzhang816/33/base -> origin/gh/xuanzhang816/33/base 2025-12-04T11:11:09.6390988Z * [new branch] gh/xuanzhang816/33/head -> origin/gh/xuanzhang816/33/head 2025-12-04T11:11:09.6391065Z * [new branch] gh/xuanzhang816/33/orig -> origin/gh/xuanzhang816/33/orig 2025-12-04T11:11:09.6391174Z * [new branch] gh/xuanzhang816/34/base -> origin/gh/xuanzhang816/34/base 2025-12-04T11:11:09.6391250Z * [new branch] gh/xuanzhang816/34/head -> origin/gh/xuanzhang816/34/head 2025-12-04T11:11:09.6391349Z * [new branch] gh/xuanzhang816/34/orig -> origin/gh/xuanzhang816/34/orig 2025-12-04T11:11:09.6391428Z * [new branch] gh/xuanzhang816/35/base -> origin/gh/xuanzhang816/35/base 2025-12-04T11:11:09.6391507Z * [new branch] gh/xuanzhang816/35/head -> origin/gh/xuanzhang816/35/head 2025-12-04T11:11:09.6391584Z * [new branch] gh/xuanzhang816/35/orig -> origin/gh/xuanzhang816/35/orig 2025-12-04T11:11:09.6391662Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-12-04T11:11:09.6391736Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-12-04T11:11:09.6391812Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 
2025-12-04T11:11:09.6391887Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-12-04T11:11:09.6391959Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-12-04T11:11:09.6392033Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-12-04T11:11:09.6392110Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-12-04T11:11:09.6392185Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-12-04T11:11:09.6392260Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-12-04T11:11:09.6392334Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-12-04T11:11:09.6392408Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-12-04T11:11:09.6392485Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-12-04T11:11:09.6392556Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-12-04T11:11:09.6392627Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-12-04T11:11:09.6392704Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-12-04T11:11:09.6392776Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-12-04T11:11:09.6392849Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-12-04T11:11:09.6392923Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-12-04T11:11:09.6392994Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-12-04T11:11:09.6393065Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-12-04T11:11:09.6393143Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-12-04T11:11:09.6393217Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-12-04T11:11:09.6393290Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-12-04T11:11:09.6393365Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-12-04T11:11:09.6393436Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-12-04T11:11:09.6393507Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-12-04T11:11:09.6393580Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-12-04T11:11:09.6393651Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-12-04T11:11:09.6393724Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-12-04T11:11:09.6393816Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-12-04T11:11:09.6393887Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-12-04T11:11:09.6393962Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-12-04T11:11:09.6394082Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-12-04T11:11:09.6394152Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-12-04T11:11:09.6394226Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-12-04T11:11:09.6394297Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-12-04T11:11:09.6394369Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-12-04T11:11:09.6394444Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-12-04T11:11:09.6394518Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-12-04T11:11:09.6394589Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 
2025-12-04T11:11:09.6394665Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-12-04T11:11:09.6394747Z * [new branch] gh/yang-yu-hang/1/base -> origin/gh/yang-yu-hang/1/base 2025-12-04T11:11:09.6394825Z * [new branch] gh/yang-yu-hang/1/head -> origin/gh/yang-yu-hang/1/head 2025-12-04T11:11:09.6394906Z * [new branch] gh/yang-yu-hang/1/orig -> origin/gh/yang-yu-hang/1/orig 2025-12-04T11:11:09.6394982Z * [new branch] gh/yang-yu-hang/2/base -> origin/gh/yang-yu-hang/2/base 2025-12-04T11:11:09.6395058Z * [new branch] gh/yang-yu-hang/2/head -> origin/gh/yang-yu-hang/2/head 2025-12-04T11:11:09.6395136Z * [new branch] gh/yang-yu-hang/2/orig -> origin/gh/yang-yu-hang/2/orig 2025-12-04T11:11:09.6395211Z * [new branch] gh/yang-yu-hang/3/base -> origin/gh/yang-yu-hang/3/base 2025-12-04T11:11:09.6395290Z * [new branch] gh/yang-yu-hang/3/head -> origin/gh/yang-yu-hang/3/head 2025-12-04T11:11:09.6395367Z * [new branch] gh/yang-yu-hang/3/orig -> origin/gh/yang-yu-hang/3/orig 2025-12-04T11:11:09.6395443Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-12-04T11:11:09.6395522Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-12-04T11:11:09.6395597Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-12-04T11:11:09.6395672Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-12-04T11:11:09.6395748Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-12-04T11:11:09.6395821Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-12-04T11:11:09.6395896Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-12-04T11:11:09.6395969Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-12-04T11:11:09.6396042Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-12-04T11:11:09.6396114Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-12-04T11:11:09.6396188Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-12-04T11:11:09.6396260Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-12-04T11:11:09.6396332Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-12-04T11:11:09.6396406Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-12-04T11:11:09.6396477Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-12-04T11:11:09.6396578Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-12-04T11:11:09.6396654Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-12-04T11:11:09.6396745Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-12-04T11:11:09.6396818Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-12-04T11:11:09.6396895Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-12-04T11:11:09.6396968Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-12-04T11:11:09.6397043Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-12-04T11:11:09.6397113Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-12-04T11:11:09.6397184Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-12-04T11:11:09.6397253Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-12-04T11:11:09.6397320Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-12-04T11:11:09.6397389Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 
2025-12-04T11:11:09.6397459Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-12-04T11:11:09.6397526Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-12-04T11:11:09.6397594Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-12-04T11:11:09.6397667Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-12-04T11:11:09.6397734Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-12-04T11:11:09.6397801Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-12-04T11:11:09.6397871Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-12-04T11:11:09.6397939Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-12-04T11:11:09.6398007Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-12-04T11:11:09.6398077Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-12-04T11:11:09.6398177Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-12-04T11:11:09.6398246Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-12-04T11:11:09.6398319Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-12-04T11:11:09.6398390Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-12-04T11:11:09.6398459Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-12-04T11:11:09.6398533Z * [new branch] gh/ydwu4/327/base -> origin/gh/ydwu4/327/base 2025-12-04T11:11:09.6398603Z * [new branch] gh/ydwu4/327/head -> origin/gh/ydwu4/327/head 2025-12-04T11:11:09.6398677Z * [new branch] gh/ydwu4/327/orig -> origin/gh/ydwu4/327/orig 2025-12-04T11:11:09.6398747Z * [new branch] gh/ydwu4/328/base -> origin/gh/ydwu4/328/base 2025-12-04T11:11:09.6398816Z * [new branch] gh/ydwu4/328/head -> origin/gh/ydwu4/328/head 2025-12-04T11:11:09.6398890Z * [new branch] gh/ydwu4/328/orig -> origin/gh/ydwu4/328/orig 2025-12-04T11:11:09.6398960Z * [new branch] gh/ydwu4/329/base -> origin/gh/ydwu4/329/base 2025-12-04T11:11:09.6399028Z * [new branch] gh/ydwu4/329/head -> origin/gh/ydwu4/329/head 2025-12-04T11:11:09.6399101Z * [new branch] gh/ydwu4/329/orig -> origin/gh/ydwu4/329/orig 2025-12-04T11:11:09.6399203Z * [new branch] gh/ydwu4/330/base -> origin/gh/ydwu4/330/base 2025-12-04T11:11:09.6399273Z * [new branch] gh/ydwu4/330/head -> origin/gh/ydwu4/330/head 2025-12-04T11:11:09.6399374Z * [new branch] gh/ydwu4/330/orig -> origin/gh/ydwu4/330/orig 2025-12-04T11:11:09.6399443Z * [new branch] gh/ydwu4/331/base -> origin/gh/ydwu4/331/base 2025-12-04T11:11:09.6399512Z * [new branch] gh/ydwu4/331/head -> origin/gh/ydwu4/331/head 2025-12-04T11:11:09.6399586Z * [new branch] gh/ydwu4/331/orig -> origin/gh/ydwu4/331/orig 2025-12-04T11:11:09.6409244Z * [new branch] gh/ydwu4/332/base -> origin/gh/ydwu4/332/base 2025-12-04T11:11:09.6409325Z * [new branch] gh/ydwu4/332/head -> origin/gh/ydwu4/332/head 2025-12-04T11:11:09.6409397Z * [new branch] gh/ydwu4/332/orig -> origin/gh/ydwu4/332/orig 2025-12-04T11:11:09.6409474Z * [new branch] gh/ydwu4/333/base -> origin/gh/ydwu4/333/base 2025-12-04T11:11:09.6409541Z * [new branch] gh/ydwu4/333/head -> origin/gh/ydwu4/333/head 2025-12-04T11:11:09.6409616Z * [new branch] gh/ydwu4/333/orig -> origin/gh/ydwu4/333/orig 2025-12-04T11:11:09.6409694Z * [new branch] gh/ydwu4/334/base -> origin/gh/ydwu4/334/base 2025-12-04T11:11:09.6409765Z * [new branch] gh/ydwu4/334/head -> origin/gh/ydwu4/334/head 2025-12-04T11:11:09.6409833Z * [new branch] gh/ydwu4/334/orig -> origin/gh/ydwu4/334/orig 2025-12-04T11:11:09.6409900Z * [new branch] 
gh/ydwu4/335/base -> origin/gh/ydwu4/335/base 2025-12-04T11:11:09.6409972Z * [new branch] gh/ydwu4/335/head -> origin/gh/ydwu4/335/head 2025-12-04T11:11:09.6410040Z * [new branch] gh/ydwu4/335/orig -> origin/gh/ydwu4/335/orig 2025-12-04T11:11:09.6410109Z * [new branch] gh/ydwu4/337/base -> origin/gh/ydwu4/337/base 2025-12-04T11:11:09.6410185Z * [new branch] gh/ydwu4/337/head -> origin/gh/ydwu4/337/head 2025-12-04T11:11:09.6410254Z * [new branch] gh/ydwu4/337/orig -> origin/gh/ydwu4/337/orig 2025-12-04T11:11:09.6410324Z * [new branch] gh/ydwu4/339/base -> origin/gh/ydwu4/339/base 2025-12-04T11:11:09.6410399Z * [new branch] gh/ydwu4/339/head -> origin/gh/ydwu4/339/head 2025-12-04T11:11:09.6410465Z * [new branch] gh/ydwu4/339/orig -> origin/gh/ydwu4/339/orig 2025-12-04T11:11:09.6410536Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-12-04T11:11:09.6410608Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-12-04T11:11:09.6410677Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-12-04T11:11:09.6410748Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-12-04T11:11:09.6410829Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-12-04T11:11:09.6410905Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-12-04T11:11:09.6410982Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-12-04T11:11:09.6411061Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-12-04T11:11:09.6411138Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-12-04T11:11:09.6411211Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-12-04T11:11:09.6411294Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-12-04T11:11:09.6411368Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-12-04T11:11:09.6411492Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-12-04T11:11:09.6411566Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-12-04T11:11:09.6411643Z * [new branch] gh/yushangdi/1/base -> origin/gh/yushangdi/1/base 2025-12-04T11:11:09.6411753Z * [new branch] gh/yushangdi/1/head -> origin/gh/yushangdi/1/head 2025-12-04T11:11:09.6411830Z * [new branch] gh/yushangdi/10/base -> origin/gh/yushangdi/10/base 2025-12-04T11:11:09.6411906Z * [new branch] gh/yushangdi/10/head -> origin/gh/yushangdi/10/head 2025-12-04T11:11:09.6411985Z * [new branch] gh/yushangdi/10/orig -> origin/gh/yushangdi/10/orig 2025-12-04T11:11:09.6412060Z * [new branch] gh/yushangdi/11/base -> origin/gh/yushangdi/11/base 2025-12-04T11:11:09.6412134Z * [new branch] gh/yushangdi/11/head -> origin/gh/yushangdi/11/head 2025-12-04T11:11:09.6412216Z * [new branch] gh/yushangdi/11/orig -> origin/gh/yushangdi/11/orig 2025-12-04T11:11:09.6412291Z * [new branch] gh/yushangdi/2/base -> origin/gh/yushangdi/2/base 2025-12-04T11:11:09.6412366Z * [new branch] gh/yushangdi/2/head -> origin/gh/yushangdi/2/head 2025-12-04T11:11:09.6412446Z * [new branch] gh/yushangdi/7/base -> origin/gh/yushangdi/7/base 2025-12-04T11:11:09.6412520Z * [new branch] gh/yushangdi/7/head -> origin/gh/yushangdi/7/head 2025-12-04T11:11:09.6412593Z * [new branch] gh/yushangdi/7/orig -> origin/gh/yushangdi/7/orig 2025-12-04T11:11:09.6412674Z * [new branch] gh/yushangdi/8/base -> origin/gh/yushangdi/8/base 2025-12-04T11:11:09.6412748Z * [new branch] gh/yushangdi/8/head -> origin/gh/yushangdi/8/head 2025-12-04T11:11:09.6412822Z * [new branch] 
gh/yushangdi/8/orig -> origin/gh/yushangdi/8/orig 2025-12-04T11:11:09.6412902Z * [new branch] gh/yushangdi/9/base -> origin/gh/yushangdi/9/base 2025-12-04T11:11:09.6412977Z * [new branch] gh/yushangdi/9/head -> origin/gh/yushangdi/9/head 2025-12-04T11:11:09.6413050Z * [new branch] gh/yushangdi/9/orig -> origin/gh/yushangdi/9/orig 2025-12-04T11:11:09.6413130Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-12-04T11:11:09.6413203Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-12-04T11:11:09.6413279Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-12-04T11:11:09.6413348Z * [new branch] gh/zklaus/20/base -> origin/gh/zklaus/20/base 2025-12-04T11:11:09.6413417Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-12-04T11:11:09.6413491Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-12-04T11:11:09.6413561Z * [new branch] gh/zklaus/21/base -> origin/gh/zklaus/21/base 2025-12-04T11:11:09.6413631Z * [new branch] gh/zklaus/21/head -> origin/gh/zklaus/21/head 2025-12-04T11:11:09.6413706Z * [new branch] gh/zklaus/21/orig -> origin/gh/zklaus/21/orig 2025-12-04T11:11:09.6413778Z * [new branch] gh/zklaus/22/base -> origin/gh/zklaus/22/base 2025-12-04T11:11:09.6413848Z * [new branch] gh/zklaus/22/head -> origin/gh/zklaus/22/head 2025-12-04T11:11:09.6413924Z * [new branch] gh/zklaus/22/orig -> origin/gh/zklaus/22/orig 2025-12-04T11:11:09.6413992Z * [new branch] gh/zklaus/23/base -> origin/gh/zklaus/23/base 2025-12-04T11:11:09.6414062Z * [new branch] gh/zklaus/23/head -> origin/gh/zklaus/23/head 2025-12-04T11:11:09.6414134Z * [new branch] gh/zklaus/23/orig -> origin/gh/zklaus/23/orig 2025-12-04T11:11:09.6414224Z * [new branch] gh/zklaus/24/base -> origin/gh/zklaus/24/base 2025-12-04T11:11:09.6414295Z * [new branch] gh/zklaus/24/head -> origin/gh/zklaus/24/head 2025-12-04T11:11:09.6414371Z * [new branch] gh/zklaus/24/orig -> origin/gh/zklaus/24/orig 2025-12-04T11:11:09.6414470Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-12-04T11:11:09.6414544Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-12-04T11:11:09.6414622Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-12-04T11:11:09.6414693Z * [new branch] gh/zou3519/1199/base -> origin/gh/zou3519/1199/base 2025-12-04T11:11:09.6414767Z * [new branch] gh/zou3519/1199/head -> origin/gh/zou3519/1199/head 2025-12-04T11:11:09.6414846Z * [new branch] gh/zou3519/1199/orig -> origin/gh/zou3519/1199/orig 2025-12-04T11:11:09.6414922Z * [new branch] gh/zou3519/1200/base -> origin/gh/zou3519/1200/base 2025-12-04T11:11:09.6415002Z * [new branch] gh/zou3519/1200/head -> origin/gh/zou3519/1200/head 2025-12-04T11:11:09.6415077Z * [new branch] gh/zou3519/1200/orig -> origin/gh/zou3519/1200/orig 2025-12-04T11:11:09.6415152Z * [new branch] gh/zou3519/1201/base -> origin/gh/zou3519/1201/base 2025-12-04T11:11:09.6415226Z * [new branch] gh/zou3519/1201/head -> origin/gh/zou3519/1201/head 2025-12-04T11:11:09.6415299Z * [new branch] gh/zou3519/1201/orig -> origin/gh/zou3519/1201/orig 2025-12-04T11:11:09.6415372Z * [new branch] gh/zou3519/1202/base -> origin/gh/zou3519/1202/base 2025-12-04T11:11:09.6415449Z * [new branch] gh/zou3519/1202/head -> origin/gh/zou3519/1202/head 2025-12-04T11:11:09.6415522Z * [new branch] gh/zou3519/1202/orig -> origin/gh/zou3519/1202/orig 2025-12-04T11:11:09.6415599Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-12-04T11:11:09.6415676Z * [new branch] gh/zpcore/1/head -> 
origin/gh/zpcore/1/head 2025-12-04T11:11:09.6415747Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-12-04T11:11:09.6415818Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-12-04T11:11:09.6415893Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-12-04T11:11:09.6415965Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-12-04T11:11:09.6416034Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-12-04T11:11:09.6416106Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-12-04T11:11:09.6416176Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-12-04T11:11:09.6416246Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-12-04T11:11:09.6416320Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-12-04T11:11:09.6416392Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-12-04T11:11:09.6416463Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-12-04T11:11:09.6416541Z * [new branch] gh/zpcore/14/orig -> origin/gh/zpcore/14/orig 2025-12-04T11:11:09.6416612Z * [new branch] gh/zpcore/15/base -> origin/gh/zpcore/15/base 2025-12-04T11:11:09.6416687Z * [new branch] gh/zpcore/15/head -> origin/gh/zpcore/15/head 2025-12-04T11:11:09.6416756Z * [new branch] gh/zpcore/15/orig -> origin/gh/zpcore/15/orig 2025-12-04T11:11:09.6416828Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-12-04T11:11:09.6416900Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-12-04T11:11:09.6416992Z * [new branch] gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-12-04T11:11:09.6417064Z * [new branch] gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-12-04T11:11:09.6417160Z * [new branch] gh/zpcore/21/orig -> origin/gh/zpcore/21/orig 2025-12-04T11:11:09.6417230Z * [new branch] gh/zpcore/22/base -> origin/gh/zpcore/22/base 2025-12-04T11:11:09.6417302Z * [new branch] gh/zpcore/22/head -> origin/gh/zpcore/22/head 2025-12-04T11:11:09.6417377Z * [new branch] gh/zpcore/22/orig -> origin/gh/zpcore/22/orig 2025-12-04T11:11:09.6417448Z * [new branch] gh/zpcore/23/base -> origin/gh/zpcore/23/base 2025-12-04T11:11:09.6417517Z * [new branch] gh/zpcore/23/head -> origin/gh/zpcore/23/head 2025-12-04T11:11:09.6417592Z * [new branch] gh/zpcore/23/orig -> origin/gh/zpcore/23/orig 2025-12-04T11:11:09.6417663Z * [new branch] gh/zpcore/24/base -> origin/gh/zpcore/24/base 2025-12-04T11:11:09.6417733Z * [new branch] gh/zpcore/24/head -> origin/gh/zpcore/24/head 2025-12-04T11:11:09.6417809Z * [new branch] gh/zpcore/24/orig -> origin/gh/zpcore/24/orig 2025-12-04T11:11:09.6417879Z * [new branch] gh/zpcore/25/base -> origin/gh/zpcore/25/base 2025-12-04T11:11:09.6417950Z * [new branch] gh/zpcore/25/head -> origin/gh/zpcore/25/head 2025-12-04T11:11:09.6418025Z * [new branch] gh/zpcore/25/orig -> origin/gh/zpcore/25/orig 2025-12-04T11:11:09.6418095Z * [new branch] gh/zpcore/26/base -> origin/gh/zpcore/26/base 2025-12-04T11:11:09.6418210Z * [new branch] gh/zpcore/26/head -> origin/gh/zpcore/26/head 2025-12-04T11:11:09.6418291Z * [new branch] gh/zpcore/26/orig -> origin/gh/zpcore/26/orig 2025-12-04T11:11:09.6418362Z * [new branch] gh/zpcore/27/base -> origin/gh/zpcore/27/base 2025-12-04T11:11:09.6418439Z * [new branch] gh/zpcore/27/head -> origin/gh/zpcore/27/head 2025-12-04T11:11:09.6418513Z * [new branch] gh/zpcore/27/orig -> origin/gh/zpcore/27/orig 2025-12-04T11:11:09.6418583Z * [new branch] gh/zpcore/28/base -> origin/gh/zpcore/28/base 
2025-12-04T11:11:09.6418658Z * [new branch] gh/zpcore/28/head -> origin/gh/zpcore/28/head 2025-12-04T11:11:09.6418726Z * [new branch] gh/zpcore/28/orig -> origin/gh/zpcore/28/orig 2025-12-04T11:11:09.6418794Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-12-04T11:11:09.6418871Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-12-04T11:11:09.6418942Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-12-04T11:11:09.6419013Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-12-04T11:11:09.6419090Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-12-04T11:11:09.6419160Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-12-04T11:11:09.6419230Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-12-04T11:11:09.6419306Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-12-04T11:11:09.6419375Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-12-04T11:11:09.6419443Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-12-04T11:11:09.6419517Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-12-04T11:11:09.6419586Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-12-04T11:11:09.6419686Z * [new branch] google-main -> origin/google-main 2025-12-04T11:11:09.6419781Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-12-04T11:11:09.6419858Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-12-04T11:11:09.6420036Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-12-04T11:11:09.6420163Z * [new branch] hameerabbasi/complex_tensor_subclass -> origin/hameerabbasi/complex_tensor_subclass 2025-12-04T11:11:09.6420307Z * [new branch] hameerabbasi/fix-ctensor-gradcheck-tests -> origin/hameerabbasi/fix-ctensor-gradcheck-tests 2025-12-04T11:11:09.6420423Z * [new branch] hameerabbasi/gradcheck-allclose -> origin/hameerabbasi/gradcheck-allclose 2025-12-04T11:11:09.6420491Z * [new branch] hc_baseline -> origin/hc_baseline 2025-12-04T11:11:09.6420560Z * [new branch] hhh_rand -> origin/hhh_rand 2025-12-04T11:11:09.6420628Z * [new branch] huba/f1 -> origin/huba/f1 2025-12-04T11:11:09.6420821Z * [new branch] increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test -> origin/increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test 2025-12-04T11:11:09.6420886Z * [new branch] inlining -> origin/inlining 2025-12-04T11:11:09.6420961Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-12-04T11:11:09.6421049Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-12-04T11:11:09.6421234Z * [new branch] instrument-trunk-pull-linux-with-job-test-filters -> origin/instrument-trunk-pull-linux-with-job-test-filters 2025-12-04T11:11:09.6421313Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-12-04T11:11:09.6421392Z * [new branch] issue#58739 -> origin/issue#58739 2025-12-04T11:11:09.6421477Z * [new branch] jainapurva-patch-1 -> origin/jainapurva-patch-1 2025-12-04T11:11:09.6421548Z * [new branch] jathu/o3 -> origin/jathu/o3 2025-12-04T11:11:09.6421614Z * [new branch] jathu/sve -> origin/jathu/sve 2025-12-04T11:11:09.6421741Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-12-04T11:11:09.6421852Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-12-04T11:11:09.6421969Z * [new branch] jiannanWang/memorysnapshot_filter -> 
origin/jiannanWang/memorysnapshot_filter 2025-12-04T11:11:09.6422087Z * [new branch] jiannanWang/profilerstepwarning -> origin/jiannanWang/profilerstepwarning 2025-12-04T11:11:09.6422177Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-12-04T11:11:09.6422268Z * [new branch] jithunnair-amd-patch-10 -> origin/jithunnair-amd-patch-10 2025-12-04T11:11:09.6422358Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-12-04T11:11:09.6422442Z * [new branch] jithunnair-amd-patch-3 -> origin/jithunnair-amd-patch-3 2025-12-04T11:11:09.6422525Z * [new branch] jithunnair-amd-patch-4 -> origin/jithunnair-amd-patch-4 2025-12-04T11:11:09.6422613Z * [new branch] jithunnair-amd-patch-5 -> origin/jithunnair-amd-patch-5 2025-12-04T11:11:09.6422696Z * [new branch] jithunnair-amd-patch-6 -> origin/jithunnair-amd-patch-6 2025-12-04T11:11:09.6422778Z * [new branch] jithunnair-amd-patch-7 -> origin/jithunnair-amd-patch-7 2025-12-04T11:11:09.6422865Z * [new branch] jithunnair-amd-patch-8 -> origin/jithunnair-amd-patch-8 2025-12-04T11:11:09.6422947Z * [new branch] jithunnair-amd-patch-9 -> origin/jithunnair-amd-patch-9 2025-12-04T11:11:09.6423049Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-12-04T11:11:09.6423131Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-12-04T11:11:09.6423219Z * [new branch] kainan_test -> origin/kainan_test 2025-12-04T11:11:09.6423302Z * [new branch] larryliu0820-patch-1 -> origin/larryliu0820-patch-1 2025-12-04T11:11:09.6423415Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-12-04T11:11:09.6423522Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-12-04T11:11:09.6423610Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-12-04T11:11:09.6423714Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-12-04T11:11:09.6423799Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-12-04T11:11:09.6423872Z * [new branch] llama4-stable -> origin/llama4-stable 2025-12-04T11:11:09.6423938Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-12-04T11:11:09.6424018Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-12-04T11:11:09.6424099Z * [new branch] lucaskabela/fix_164876 -> origin/lucaskabela/fix_164876 2025-12-04T11:11:09.6424190Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-12-04T11:11:09.6424290Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-12-04T11:11:09.6424398Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-12-04T11:11:09.6424530Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-12-04T11:11:09.6424645Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-12-04T11:11:09.6424778Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-12-04T11:11:09.6424864Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-12-04T11:11:09.6424959Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-12-04T11:11:09.6425059Z * [new branch] lucaskabela/typing_ctx_manager -> origin/lucaskabela/typing_ctx_manager 2025-12-04T11:11:09.6425160Z * [new 
branch] lucaskabela/typing_nn_module -> origin/lucaskabela/typing_nn_module 2025-12-04T11:11:09.6425263Z * [new branch] lucaskabela/typing_user_defined -> origin/lucaskabela/typing_user_defined 2025-12-04T11:11:09.6425362Z * [new branch] lucaskabela/typing_variables -> origin/lucaskabela/typing_variables 2025-12-04T11:11:09.6425475Z * [new branch] lucaskabela/typing_variables_dicts -> origin/lucaskabela/typing_variables_dicts 2025-12-04T11:11:09.6425600Z * [new branch] lucaskabela/typing_variables_functions -> origin/lucaskabela/typing_variables_functions 2025-12-04T11:11:09.6425712Z * [new branch] lucaskabela/typing_variables_lists -> origin/lucaskabela/typing_variables_lists 2025-12-04T11:11:09.6425789Z * [new branch] lw/torch_box_by_ref -> origin/lw/torch_box_by_ref 2025-12-04T11:11:09.6425855Z * [new branch] main -> origin/main 2025-12-04T11:11:09.6425932Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-12-04T11:11:09.6426005Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-12-04T11:11:09.6426074Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-12-04T11:11:09.6426170Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-12-04T11:11:09.6426239Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-12-04T11:11:09.6426328Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-12-04T11:11:09.6426399Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-12-04T11:11:09.6426466Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-12-04T11:11:09.6426542Z * [new branch] malfet/add-3.14-ci -> origin/malfet/add-3.14-ci 2025-12-04T11:11:09.6426710Z * [new branch] malfet/be-do-not-make-typos-in-build-artifacts -> origin/malfet/be-do-not-make-typos-in-build-artifacts 2025-12-04T11:11:09.6426882Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-12-04T11:11:09.6427013Z * [new branch] malfet/be-remove-misisng-neon-headers -> origin/malfet/be-remove-misisng-neon-headers 2025-12-04T11:11:09.6427118Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-12-04T11:11:09.6427239Z * [new branch] manuel/aoti_metal_shimify-thread_safe -> origin/manuel/aoti_metal_shimify-thread_safe 2025-12-04T11:11:09.6427338Z * [new branch] manuel/inductor_link_openmp -> origin/manuel/inductor_link_openmp 2025-12-04T11:11:09.6427417Z * [new branch] masnesral/metaconda -> origin/masnesral/metaconda 2025-12-04T11:11:09.6427496Z * [new branch] mem_profiler_flaky_fix -> origin/mem_profiler_flaky_fix 2025-12-04T11:11:09.6427583Z * [new branch] mem_profiler_stack_trace -> origin/mem_profiler_stack_trace 2025-12-04T11:11:09.6427661Z * [new branch] memory_profiler_stack -> origin/memory_profiler_stack 2025-12-04T11:11:09.6427739Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-12-04T11:11:09.6427812Z * [new branch] mingw_posix -> origin/mingw_posix 2025-12-04T11:11:09.6427891Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-12-04T11:11:09.6427958Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-12-04T11:11:09.6428027Z * [new branch] mlazos/acts -> origin/mlazos/acts 2025-12-04T11:11:09.6428102Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-12-04T11:11:09.6428223Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-12-04T11:11:09.6428332Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-12-04T11:11:09.6428408Z * [new branch] 
mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-12-04T11:11:09.6428477Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-12-04T11:11:09.6428550Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-12-04T11:11:09.6428620Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-12-04T11:11:09.6428686Z * [new branch] mlazos/bwd -> origin/mlazos/bwd 2025-12-04T11:11:09.6428763Z * [new branch] mlazos/combo-test -> origin/mlazos/combo-test 2025-12-04T11:11:09.6428837Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-12-04T11:11:09.6428917Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-12-04T11:11:09.6429001Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-12-04T11:11:09.6429106Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-12-04T11:11:09.6429216Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-12-04T11:11:09.6429301Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-12-04T11:11:09.6429384Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-12-04T11:11:09.6429485Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-12-04T11:11:09.6429556Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-12-04T11:11:09.6429625Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-12-04T11:11:09.6429702Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-12-04T11:11:09.6429772Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-12-04T11:11:09.6429842Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-12-04T11:11:09.6429916Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-12-04T11:11:09.6430001Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-12-04T11:11:09.6430074Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-12-04T11:11:09.6430143Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-12-04T11:11:09.6430212Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-12-04T11:11:09.6430293Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-12-04T11:11:09.6430369Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-12-04T11:11:09.6430437Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-12-04T11:11:09.6430506Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-12-04T11:11:09.6430581Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-12-04T11:11:09.6430650Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-12-04T11:11:09.6430716Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-12-04T11:11:09.6430789Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-12-04T11:11:09.6430857Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-12-04T11:11:09.6430930Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-12-04T11:11:09.6431000Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-12-04T11:11:09.6431067Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-12-04T11:11:09.6431137Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-12-04T11:11:09.6431202Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-12-04T11:11:09.6431268Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-12-04T11:11:09.6431334Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-12-04T11:11:09.6431398Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 
2025-12-04T11:11:09.6431463Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-12-04T11:11:09.6431529Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-12-04T11:11:09.6431594Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-12-04T11:11:09.6431658Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-12-04T11:11:09.6431724Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-12-04T11:11:09.6431788Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-12-04T11:11:09.6431850Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-12-04T11:11:09.6431939Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-12-04T11:11:09.6432001Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-12-04T11:11:09.6432105Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-12-04T11:11:09.6432195Z * [new branch] mlazos/inductor-streams -> origin/mlazos/inductor-streams 2025-12-04T11:11:09.6432259Z * [new branch] mlazos/main -> origin/mlazos/main 2025-12-04T11:11:09.6432323Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-12-04T11:11:09.6432402Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-12-04T11:11:09.6432508Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-12-04T11:11:09.6432607Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-12-04T11:11:09.6432686Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-12-04T11:11:09.6432757Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-12-04T11:11:09.6432828Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-12-04T11:11:09.6432904Z * [new branch] mlazos/overguarding -> origin/mlazos/overguarding 2025-12-04T11:11:09.6432979Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-12-04T11:11:09.6433053Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-12-04T11:11:09.6433125Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-12-04T11:11:09.6433201Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-12-04T11:11:09.6433272Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-12-04T11:11:09.6433343Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-12-04T11:11:09.6433409Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-12-04T11:11:09.6433494Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-12-04T11:11:09.6433582Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-12-04T11:11:09.6433649Z * [new branch] mlazos/stests -> origin/mlazos/stests 2025-12-04T11:11:09.6433726Z * [new branch] mlazos/stream-ops -> origin/mlazos/stream-ops 2025-12-04T11:11:09.6433794Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-12-04T11:11:09.6433876Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-12-04T11:11:09.6433944Z * [new branch] mlazos/test -> origin/mlazos/test 2025-12-04T11:11:09.6434013Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-12-04T11:11:09.6434094Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-12-04T11:11:09.6434178Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-12-04T11:11:09.6434258Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-12-04T11:11:09.6434338Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-12-04T11:11:09.6434421Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 
2025-12-04T11:11:09.6434495Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-12-04T11:11:09.6434574Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-12-04T11:11:09.6434652Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-12-04T11:11:09.6434752Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-12-04T11:11:09.6434840Z * [new branch] mlazos/user-stream-base -> origin/mlazos/user-stream-base 2025-12-04T11:11:09.6434938Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-12-04T11:11:09.6435032Z * [new branch] mlazos/user-streams-backup -> origin/mlazos/user-streams-backup 2025-12-04T11:11:09.6435133Z * [new branch] mlazos/user-streams-backup2 -> origin/mlazos/user-streams-backup2 2025-12-04T11:11:09.6435205Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-12-04T11:11:09.6435277Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-12-04T11:11:09.6435356Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-12-04T11:11:09.6435431Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-12-04T11:11:09.6435499Z * [new branch] module-shim -> origin/module-shim 2025-12-04T11:11:09.6435567Z * [new branch] move_config -> origin/move_config 2025-12-04T11:11:09.6435641Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-12-04T11:11:09.6435713Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-12-04T11:11:09.6435823Z * [new branch] mwizak/fix-triton-block-shape -> origin/mwizak/fix-triton-block-shape 2025-12-04T11:11:09.6435892Z * [new branch] my_varlen_backup -> origin/my_varlen_backup 2025-12-04T11:11:09.6435969Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-12-04T11:11:09.6436041Z * [new branch] new-codegen -> origin/new-codegen 2025-12-04T11:11:09.6436110Z * [new branch] newtest-base -> origin/newtest-base 2025-12-04T11:11:09.6436185Z * [new branch] ngimel/addmm_dtype -> origin/ngimel/addmm_dtype 2025-12-04T11:11:09.6436258Z * [new branch] ngimel/div_inv -> origin/ngimel/div_inv 2025-12-04T11:11:09.6436340Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-12-04T11:11:09.6436415Z * [new branch] ngimel/gather_grid -> origin/ngimel/gather_grid 2025-12-04T11:11:09.6436507Z * [new branch] ngimel/gather_grid_release -> origin/ngimel/gather_grid_release 2025-12-04T11:11:09.6436574Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-12-04T11:11:09.6436648Z * [new branch] ngimel/hostalloc -> origin/ngimel/hostalloc 2025-12-04T11:11:09.6436720Z * [new branch] ngimel/storage_id -> origin/ngimel/storage_id 2025-12-04T11:11:09.6436785Z * [new branch] nightly -> origin/nightly 2025-12-04T11:11:09.6436910Z * [new branch] nikitaved/addmm_1_rowcol_lt_path_check -> origin/nikitaved/addmm_1_rowcol_lt_path_check 2025-12-04T11:11:09.6437036Z * [new branch] nikitaved/addmm_epilogue_fusions_2d_bias -> origin/nikitaved/addmm_epilogue_fusions_2d_bias 2025-12-04T11:11:09.6437167Z * [new branch] nikitaved/addmm_epilogue_fusions_inductor -> origin/nikitaved/addmm_epilogue_fusions_inductor 2025-12-04T11:11:09.6437296Z * [new branch] nikitaved/addmm_epilogue_fusions_scratch -> origin/nikitaved/addmm_epilogue_fusions_scratch 2025-12-04T11:11:09.6437416Z * [new branch] nikitaved/grad_addmm_epilogue_fusions -> origin/nikitaved/grad_addmm_epilogue_fusions 2025-12-04T11:11:09.6437531Z * [new branch] nikitaved/simpler_can_use_32bit_index -> origin/nikitaved/simpler_can_use_32bit_index 
2025-12-04T11:11:09.6437605Z * [new branch] nikitaved/test -> origin/nikitaved/test 2025-12-04T11:11:09.6437756Z * [new branch] nmacchioni-perf-test-async-autotune -> origin/nmacchioni-perf-test-async-autotune 2025-12-04T11:11:09.6437838Z * [new branch] no_distributed_log_spew -> origin/no_distributed_log_spew 2025-12-04T11:11:09.6437926Z * [new branch] nofun-hack -> origin/nofun-hack 2025-12-04T11:11:09.6437989Z * [new branch] norm_bench -> origin/norm_bench 2025-12-04T11:11:09.6438071Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-12-04T11:11:09.6438184Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-12-04T11:11:09.6438254Z * [new branch] optimizer_test -> origin/optimizer_test 2025-12-04T11:11:09.6438330Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-12-04T11:11:09.6438403Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-12-04T11:11:09.6438474Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-12-04T11:11:09.6438547Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-12-04T11:11:09.6438614Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-12-04T11:11:09.6438685Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-12-04T11:11:09.6438752Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-12-04T11:11:09.6438818Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-12-04T11:11:09.6438885Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-12-04T11:11:09.6438950Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-12-04T11:11:09.6439017Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-12-04T11:11:09.6439087Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-12-04T11:11:09.6439153Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-12-04T11:11:09.6439218Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-12-04T11:11:09.6439287Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-12-04T11:11:09.6439352Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-12-04T11:11:09.6439417Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-12-04T11:11:09.6439485Z * [new branch] orig/release/2.9 -> origin/orig/release/2.9 2025-12-04T11:11:09.6439573Z * [new branch] origin/gh/fxdawnn/1/base -> origin/origin/gh/fxdawnn/1/base 2025-12-04T11:11:09.6439657Z * [new branch] origin/gh/fxdawnn/1/orig -> origin/origin/gh/fxdawnn/1/orig 2025-12-04T11:11:09.6439742Z * [new branch] origin/gh/zpcore/14/orig -> origin/origin/gh/zpcore/14/orig 2025-12-04T11:11:09.6439813Z * [new branch] oulgen-patch-1 -> origin/oulgen-patch-1 2025-12-04T11:11:09.6439883Z * [new branch] oulgen-patch-2 -> origin/oulgen-patch-2 2025-12-04T11:11:09.6439953Z * [new branch] oulgen-patch-3 -> origin/oulgen-patch-3 2025-12-04T11:11:09.6440022Z * [new branch] oulgen-patch-4 -> origin/oulgen-patch-4 2025-12-04T11:11:09.6440094Z * [new branch] padded-tensor -> origin/padded-tensor 2025-12-04T11:11:09.6440160Z * [new branch] pca2 -> origin/pca2 2025-12-04T11:11:09.6440235Z * [new branch] per_channel_backup -> origin/per_channel_backup 2025-12-04T11:11:09.6440301Z * [new branch] perf_ops -> origin/perf_ops 2025-12-04T11:11:09.6440365Z * [new branch] perf_ops_2_9 -> origin/perf_ops_2_9 2025-12-04T11:11:09.6440471Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-12-04T11:11:09.6440562Z * [new branch] pianpwk/__draft_debug_mode -> 
origin/pianpwk/__draft_debug_mode 2025-12-04T11:11:09.6440699Z * [new branch] pianpwk/_debug_mode_for_triton_draft -> origin/pianpwk/_debug_mode_for_triton_draft 2025-12-04T11:11:09.6440802Z * [new branch] pianpwk/_debug_nn_module_compile -> origin/pianpwk/_debug_nn_module_compile 2025-12-04T11:11:09.6440893Z * [new branch] pianpwk/_draft_triton_11_3 -> origin/pianpwk/_draft_triton_11_3 2025-12-04T11:11:09.6440987Z * [new branch] pianpwk/_manual_bucket_draft -> origin/pianpwk/_manual_bucket_draft 2025-12-04T11:11:09.6441091Z * [new branch] pianpwk/_profile_w_dispatch_keys -> origin/pianpwk/_profile_w_dispatch_keys 2025-12-04T11:11:09.6441191Z * [new branch] pianpwk/_super_draft_debug_mode -> origin/pianpwk/_super_draft_debug_mode 2025-12-04T11:11:09.6441299Z * [new branch] pianpwk/_unbacked_local_shard_size -> origin/pianpwk/_unbacked_local_shard_size 2025-12-04T11:11:09.6441375Z * [new branch] pianpwk/anomaly_tb -> origin/pianpwk/anomaly_tb 2025-12-04T11:11:09.6441463Z * [new branch] pianpwk/auto_fx_annotate -> origin/pianpwk/auto_fx_annotate 2025-12-04T11:11:09.6441578Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-12-04T11:11:09.6441668Z * [new branch] pianpwk/bert_dynamic_perf -> origin/pianpwk/bert_dynamic_perf 2025-12-04T11:11:09.6441766Z * [new branch] pianpwk/debug_fwd_stack_traces -> origin/pianpwk/debug_fwd_stack_traces 2025-12-04T11:11:09.6441852Z * [new branch] pianpwk/debug_hash_tensor -> origin/pianpwk/debug_hash_tensor 2025-12-04T11:11:09.6441944Z * [new branch] pianpwk/debug_mode_annotate -> origin/pianpwk/debug_mode_annotate 2025-12-04T11:11:09.6442035Z * [new branch] pianpwk/debug_mode_defaults -> origin/pianpwk/debug_mode_defaults 2025-12-04T11:11:09.6442119Z * [new branch] pianpwk/debug_mode_hacks -> origin/pianpwk/debug_mode_hacks 2025-12-04T11:11:09.6442230Z * [new branch] pianpwk/debug_mode_opcall_refactor -> origin/pianpwk/debug_mode_opcall_refactor 2025-12-04T11:11:09.6442321Z * [new branch] pianpwk/debug_mode_show_ids -> origin/pianpwk/debug_mode_show_ids 2025-12-04T11:11:09.6442405Z * [new branch] pianpwk/debug_mode_triton -> origin/pianpwk/debug_mode_triton 2025-12-04T11:11:09.6442504Z * [new branch] pianpwk/debug_show_stack_trace -> origin/pianpwk/debug_show_stack_trace 2025-12-04T11:11:09.6442605Z * [new branch] pianpwk/debug_wait_on_collective -> origin/pianpwk/debug_wait_on_collective 2025-12-04T11:11:09.6442704Z * [new branch] pianpwk/debugmode_compile_tf -> origin/pianpwk/debugmode_compile_tf 2025-12-04T11:11:09.6442833Z * [new branch] pianpwk/dispatch_key_debugging_for_debug -> origin/pianpwk/dispatch_key_debugging_for_debug 2025-12-04T11:11:09.6442940Z * [new branch] pianpwk/draft_debug_mode_tfcompile -> origin/pianpwk/draft_debug_mode_tfcompile 2025-12-04T11:11:09.6443040Z * [new branch] pianpwk/draft_multikernel_nn -> origin/pianpwk/draft_multikernel_nn 2025-12-04T11:11:09.6443156Z * [new branch] pianpwk/draft_multikernel_status_10_5 -> origin/pianpwk/draft_multikernel_status_10_5 2025-12-04T11:11:09.6443251Z * [new branch] pianpwk/dtensor_custom_chunk -> origin/pianpwk/dtensor_custom_chunk 2025-12-04T11:11:09.6443357Z * [new branch] pianpwk/dtensor_unbacked_keypath -> origin/pianpwk/dtensor_unbacked_keypath 2025-12-04T11:11:09.6443438Z * [new branch] pianpwk/event_list_tree -> origin/pianpwk/event_list_tree 2025-12-04T11:11:09.6443550Z * [new branch] pianpwk/false_numel_refs -> origin/pianpwk/false_numel_refs 2025-12-04T11:11:09.6443632Z * [new branch] pianpwk/maybe_guard_rel -> 
origin/pianpwk/maybe_guard_rel 2025-12-04T11:11:09.6443737Z * [new branch] pianpwk/multikernel_hints_draft -> origin/pianpwk/multikernel_hints_draft 2025-12-04T11:11:09.6443874Z * [new branch] pianpwk/no_size_oblivious_slice_scat -> origin/pianpwk/no_size_oblivious_slice_scat 2025-12-04T11:11:09.6443992Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-12-04T11:11:09.6444076Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-12-04T11:11:09.6444185Z * [new branch] pianpwk/skip_python_keys_alternate -> origin/pianpwk/skip_python_keys_alternate 2025-12-04T11:11:09.6444293Z * [new branch] pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-12-04T11:11:09.6444377Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-12-04T11:11:09.6444460Z * [new branch] pianpwk/symint_one_hot -> origin/pianpwk/symint_one_hot 2025-12-04T11:11:09.6444576Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-12-04T11:11:09.6444675Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-12-04T11:11:09.6444761Z * [new branch] pianpwk/try_dumb_stuff -> origin/pianpwk/try_dumb_stuff 2025-12-04T11:11:09.6444841Z * [new branch] pianpwk/try_dumb_stuff_2 -> origin/pianpwk/try_dumb_stuff_2 2025-12-04T11:11:09.6444933Z * [new branch] pianpwk/unbacked_dtensor_mm -> origin/pianpwk/unbacked_dtensor_mm 2025-12-04T11:11:09.6445031Z * [new branch] pianpwk/unbacked_tracing_12_2 -> origin/pianpwk/unbacked_tracing_12_2 2025-12-04T11:11:09.6445109Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-12-04T11:11:09.6445188Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-12-04T11:11:09.6445283Z * [new branch] piz/fix_partial_backward_1112 -> origin/piz/fix_partial_backward_1112 2025-12-04T11:11:09.6445361Z * [new branch] piz/prop_cache_clean -> origin/piz/prop_cache_clean 2025-12-04T11:11:09.6445430Z * [new branch] pool-separate -> origin/pool-separate 2025-12-04T11:11:09.6445496Z * [new branch] pr-156087 -> origin/pr-156087 2025-12-04T11:11:09.6445557Z * [new branch] pr/131860 -> origin/pr/131860 2025-12-04T11:11:09.6445627Z * [new branch] predispatch_to -> origin/predispatch_to 2025-12-04T11:11:09.6445696Z * [new branch] protect-c17 -> origin/protect-c17 2025-12-04T11:11:09.6445767Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-12-04T11:11:09.6445851Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-12-04T11:11:09.6445985Z * [new branch] q1l1/fix_device_moved_constant_type_unknown -> origin/q1l1/fix_device_moved_constant_type_unknown 2025-12-04T11:11:09.6446127Z * [new branch] q1l1/fix_wrong_default_type_for_kernel_call_args -> origin/q1l1/fix_wrong_default_type_for_kernel_call_args 2025-12-04T11:11:09.6446210Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-12-04T11:11:09.6446286Z * [new branch] quote-pytest_cache -> origin/quote-pytest_cache 2025-12-04T11:11:09.6446385Z * [new branch] reland-accgrad-stream-warn -> origin/reland-accgrad-stream-warn 2025-12-04T11:11:09.6446451Z * [new branch] release/1.10 -> origin/release/1.10 2025-12-04T11:11:09.6446544Z * [new branch] release/1.11 -> origin/release/1.11 2025-12-04T11:11:09.6446608Z * [new branch] release/1.12 -> origin/release/1.12 2025-12-04T11:11:09.6446674Z * [new branch] release/1.13 -> origin/release/1.13 
2025-12-04T11:11:09.6446759Z * [new branch] release/1.4 -> origin/release/1.4 2025-12-04T11:11:09.6446825Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-12-04T11:11:09.6446888Z * [new branch] release/1.5 -> origin/release/1.5 2025-12-04T11:11:09.6446950Z * [new branch] release/1.6 -> origin/release/1.6 2025-12-04T11:11:09.6447012Z * [new branch] release/1.7 -> origin/release/1.7 2025-12-04T11:11:09.6447075Z * [new branch] release/1.8 -> origin/release/1.8 2025-12-04T11:11:09.6447136Z * [new branch] release/1.9 -> origin/release/1.9 2025-12-04T11:11:09.6447198Z * [new branch] release/2.0 -> origin/release/2.0 2025-12-04T11:11:09.6447260Z * [new branch] release/2.1 -> origin/release/2.1 2025-12-04T11:11:09.6447321Z * [new branch] release/2.2 -> origin/release/2.2 2025-12-04T11:11:09.6447383Z * [new branch] release/2.3 -> origin/release/2.3 2025-12-04T11:11:09.6447446Z * [new branch] release/2.4 -> origin/release/2.4 2025-12-04T11:11:09.6447507Z * [new branch] release/2.5 -> origin/release/2.5 2025-12-04T11:11:09.6447567Z * [new branch] release/2.6 -> origin/release/2.6 2025-12-04T11:11:09.6447629Z * [new branch] release/2.7 -> origin/release/2.7 2025-12-04T11:11:09.6447690Z * [new branch] release/2.8 -> origin/release/2.8 2025-12-04T11:11:09.6447750Z * [new branch] release/2.9 -> origin/release/2.9 2025-12-04T11:11:09.6447819Z * [new branch] release_notes -> origin/release_notes 2025-12-04T11:11:09.6447896Z * [new branch] remove_pyinterpreter -> origin/remove_pyinterpreter 2025-12-04T11:11:09.6448024Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-12-04T11:11:09.6448186Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-12-04T11:11:09.6448306Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-12-04T11:11:09.6448426Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-12-04T11:11:09.6448558Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-12-04T11:11:09.6448671Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-12-04T11:11:09.6448776Z * [new branch] revert-152361-gh/fadara01/1/head -> origin/revert-152361-gh/fadara01/1/head 2025-12-04T11:11:09.6448880Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-12-04T11:11:09.6449053Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-12-04T11:11:09.6449152Z * [new branch] revert-hoo-invoke-subgraph -> origin/revert-hoo-invoke-subgraph 2025-12-04T11:11:09.6449251Z * [new branch] revert_always_build_distributed -> origin/revert_always_build_distributed 2025-12-04T11:11:09.6449322Z * [new branch] rms_norm_patch -> origin/rms_norm_patch 2025-12-04T11:11:09.6449420Z * [new branch] ruisi/fix_all_to_all_estimation -> origin/ruisi/fix_all_to_all_estimation 2025-12-04T11:11:09.6449535Z * [new branch] ruisi/fix_comm_estimation -> origin/ruisi/fix_comm_estimation 2025-12-04T11:11:09.6449644Z * [new branch] ruisi/fix_dynamic_shape_estimation -> origin/ruisi/fix_dynamic_shape_estimation 2025-12-04T11:11:09.6449770Z * [new branch] ruisi/fix_llama3_autobucketing -> origin/ruisi/fix_llama3_autobucketing 2025-12-04T11:11:09.6449875Z * [new branch] ruisi/fix_manual_bucketing_ep_pass -> 
origin/ruisi/fix_manual_bucketing_ep_pass 2025-12-04T11:11:09.6449959Z * [new branch] ruisi/manual_bucket_pass -> origin/ruisi/manual_bucket_pass 2025-12-04T11:11:09.6450107Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-12-04T11:11:09.6450195Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-12-04T11:11:09.6450275Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-12-04T11:11:09.6450339Z * [new branch] rzou/njt -> origin/rzou/njt 2025-12-04T11:11:09.6450402Z * [new branch] rzou/pca -> origin/rzou/pca 2025-12-04T11:11:09.6450472Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-12-04T11:11:09.6450536Z * [new branch] samplevllm -> origin/samplevllm 2025-12-04T11:11:09.6450704Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-12-04T11:11:09.6450799Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-12-04T11:11:09.6450913Z * [new branch] sapling-pr-archive-tushar00jain -> origin/sapling-pr-archive-tushar00jain 2025-12-04T11:11:09.6450974Z * [new branch] save -> origin/save 2025-12-04T11:11:09.6451037Z * [new branch] scaled_mm -> origin/scaled_mm 2025-12-04T11:11:09.6451102Z * [new branch] scan_attempt -> origin/scan_attempt 2025-12-04T11:11:09.6451166Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-12-04T11:11:09.6451279Z * [new branch] sekyondaMeta-dynamoconfig-fix -> origin/sekyondaMeta-dynamoconfig-fix 2025-12-04T11:11:09.6451356Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-12-04T11:11:09.6451436Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-12-04T11:11:09.6451513Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-12-04T11:11:09.6451594Z * [new branch] some_rocm_inductor_skips -> origin/some_rocm_inductor_skips 2025-12-04T11:11:09.6451676Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-12-04T11:11:09.6451761Z * [new branch] sparse-mm-bf16-support -> origin/sparse-mm-bf16-support 2025-12-04T11:11:09.6451834Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-12-04T11:11:09.6451895Z * [new branch] suo -> origin/suo 2025-12-04T11:11:09.6451959Z * [new branch] sve-poc -> origin/sve-poc 2025-12-04T11:11:09.6452022Z * [new branch] switch-bn -> origin/switch-bn 2025-12-04T11:11:09.6452116Z * [new branch] sy_annotation_in_autograd_hop -> origin/sy_annotation_in_autograd_hop 2025-12-04T11:11:09.6452186Z * [new branch] sy_aot_eager_record -> origin/sy_aot_eager_record 2025-12-04T11:11:09.6452256Z * [new branch] sy_custom_bucketing -> origin/sy_custom_bucketing 2025-12-04T11:11:09.6452326Z * [new branch] sy_debug_mode_test -> origin/sy_debug_mode_test 2025-12-04T11:11:09.6452412Z * [new branch] sy_deserialize -> origin/sy_deserialize 2025-12-04T11:11:09.6452479Z * [new branch] sy_dump_gm_code -> origin/sy_dump_gm_code 2025-12-04T11:11:09.6452542Z * [new branch] sy_exp -> origin/sy_exp 2025-12-04T11:11:09.6452637Z * [new branch] sy_export_annotation -> origin/sy_export_annotation 2025-12-04T11:11:09.6452706Z * [new branch] sy_invoke_subgraph -> origin/sy_invoke_subgraph 2025-12-04T11:11:09.6452777Z * [new branch] sy_kernel_bw_name -> origin/sy_kernel_bw_name 2025-12-04T11:11:09.6452843Z * [new branch] sy_multi_arch -> origin/sy_multi_arch 2025-12-04T11:11:09.6452912Z * [new branch] sy_nn_module_stack -> origin/sy_nn_module_stack 
2025-12-04T11:11:09.6452983Z * [new branch] sy_original_dtensor -> origin/sy_original_dtensor 2025-12-04T11:11:09.6453049Z * [new branch] sy_profiler_cia -> origin/sy_profiler_cia 2025-12-04T11:11:09.6453117Z * [new branch] symm_mem_sync -> origin/symm_mem_sync 2025-12-04T11:11:09.6453202Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-12-04T11:11:09.6453281Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-12-04T11:11:09.6453363Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-12-04T11:11:09.6453425Z * [new branch] test-old -> origin/test-old 2025-12-04T11:11:09.6453489Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-12-04T11:11:09.6453589Z * [new branch] tianren/customOp_autotune_fix -> origin/tianren/customOp_autotune_fix 2025-12-04T11:11:09.6453701Z * [new branch] tianren/customOp_enable_max_autotune -> origin/tianren/customOp_enable_max_autotune 2025-12-04T11:11:09.6453782Z * [new branch] tianren/customOp_fusion -> origin/tianren/customOp_fusion 2025-12-04T11:11:09.6453910Z * [new branch] tianren/customop_collectiveop_benchmark -> origin/tianren/customop_collectiveop_benchmark 2025-12-04T11:11:09.6454047Z * [new branch] tianren/customop_collectiveop_benchmark_fix -> origin/tianren/customop_collectiveop_benchmark_fix 2025-12-04T11:11:09.6454147Z * [new branch] tianren/customop_dynamic_config -> origin/tianren/customop_dynamic_config 2025-12-04T11:11:09.6454240Z * [new branch] tianren/dynamic_range_input -> origin/tianren/dynamic_range_input 2025-12-04T11:11:09.6454341Z * [new branch] tianren/dynamic_range_input_fix -> origin/tianren/dynamic_range_input_fix 2025-12-04T11:11:09.6454451Z * [new branch] tianren/dynamic_range_input_merge -> origin/tianren/dynamic_range_input_merge 2025-12-04T11:11:09.6454555Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-12-04T11:11:09.6454638Z * [new branch] tianren/fx_codegen_dump -> origin/tianren/fx_codegen_dump 2025-12-04T11:11:09.6454724Z * [new branch] tianren/symmetric_memory -> origin/tianren/symmetric_memory 2025-12-04T11:11:09.6454791Z * [new branch] tianren/test -> origin/tianren/test 2025-12-04T11:11:09.6454867Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-12-04T11:11:09.6454929Z * [new branch] tmp -> origin/tmp 2025-12-04T11:11:09.6454996Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-12-04T11:11:09.6455074Z * [new branch] torchtitan_integration -> origin/torchtitan_integration 2025-12-04T11:11:09.6455159Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-12-04T11:11:09.6455243Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-12-04T11:11:09.6455349Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-12-04T11:11:09.6455418Z * [new branch] triton_kernel -> origin/triton_kernel 2025-12-04T11:11:09.6455503Z * [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-12-04T11:11:09.6455565Z * [new branch] type_dec -> origin/type_dec 2025-12-04T11:11:09.6455665Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-12-04T11:11:09.6455804Z * [new branch] update-audio-commit-hash/17630256502-1803-1 -> origin/update-audio-commit-hash/17630256502-1803-1 2025-12-04T11:11:09.6455941Z * [new branch] update-audio-commit-hash/19087141161-1916-1 -> origin/update-audio-commit-hash/19087141161-1916-1 2025-12-04T11:11:09.6456076Z * [new branch] 
update-audio-commit-hash/19250643381-1929-1 -> origin/update-audio-commit-hash/19250643381-1929-1 2025-12-04T11:11:09.6456211Z * [new branch] update-audio-commit-hash/19397724337-1935-1 -> origin/update-audio-commit-hash/19397724337-1935-1 2025-12-04T11:11:09.6456346Z * [new branch] update-audio-commit-hash/19555670148-1941-1 -> origin/update-audio-commit-hash/19555670148-1941-1 2025-12-04T11:11:09.6456478Z * [new branch] update-audio-commit-hash/19750627930-1946-1 -> origin/update-audio-commit-hash/19750627930-1946-1 2025-12-04T11:11:09.6456614Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-12-04T11:11:09.6456752Z * [new branch] update-vision-commit-hash/19087141161-1916-1 -> origin/update-vision-commit-hash/19087141161-1916-1 2025-12-04T11:11:09.6456889Z * [new branch] update-vision-commit-hash/19184897099-1925-1 -> origin/update-vision-commit-hash/19184897099-1925-1 2025-12-04T11:11:09.6457026Z * [new branch] update-vision-commit-hash/19250643381-1929-1 -> origin/update-vision-commit-hash/19250643381-1929-1 2025-12-04T11:11:09.6457159Z * [new branch] update-vision-commit-hash/19381328640-1934-1 -> origin/update-vision-commit-hash/19381328640-1934-1 2025-12-04T11:11:09.6457296Z * [new branch] update-vision-commit-hash/19485237164-1938-1 -> origin/update-vision-commit-hash/19485237164-1938-1 2025-12-04T11:11:09.6457431Z * [new branch] update-vllm-commit-hash/18451675449-1879-1 -> origin/update-vllm-commit-hash/18451675449-1879-1 2025-12-04T11:11:09.6457518Z * [new branch] update-vllm-dockerfile -> origin/update-vllm-dockerfile 2025-12-04T11:11:09.6457644Z * [new branch] update-xla-commit-hash/19224287370-211-1 -> origin/update-xla-commit-hash/19224287370-211-1 2025-12-04T11:11:09.6457772Z * [new branch] update-xla-commit-hash/19422028566-212-1 -> origin/update-xla-commit-hash/19422028566-212-1 2025-12-04T11:11:09.6457894Z * [new branch] update-xla-commit-hash/19626841311-213-1 -> origin/update-xla-commit-hash/19626841311-213-1 2025-12-04T11:11:09.6458022Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-12-04T11:11:09.6458109Z * [new branch] update_operator_readme -> origin/update_operator_readme 2025-12-04T11:11:09.6458235Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-12-04T11:11:09.6458326Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-12-04T11:11:09.6458414Z * [new branch] update_slow_tests_1762155677 -> origin/update_slow_tests_1762155677 2025-12-04T11:11:09.6458501Z * [new branch] update_slow_tests_1763365283 -> origin/update_slow_tests_1763365283 2025-12-04T11:11:09.6458618Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-12-04T11:11:09.6458728Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-12-04T11:11:09.6458820Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-12-04T11:11:09.6458947Z * [new branch] upload-tests-for-autorevert -> origin/upload-tests-for-autorevert 2025-12-04T11:11:09.6459012Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-12-04T11:11:09.6459074Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-12-04T11:11:09.6459136Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-12-04T11:11:09.6459195Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-12-04T11:11:09.6459254Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-12-04T11:11:09.6459312Z * [new branch] v1.3.0 -> 
origin/v1.3.0 2025-12-04T11:11:09.6459371Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-12-04T11:11:09.6459437Z * [new branch] validate_fn -> origin/validate_fn 2025-12-04T11:11:09.6459509Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-12-04T11:11:09.6459583Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-12-04T11:11:09.6459648Z * [new branch] varlen-api -> origin/varlen-api 2025-12-04T11:11:09.6459728Z * [new branch] varlen-api-backup -> origin/varlen-api-backup 2025-12-04T11:11:09.6459809Z * [new branch] varlen_batch_invariance -> origin/varlen_batch_invariance 2025-12-04T11:11:09.6459875Z * [new branch] viable/strict -> origin/viable/strict 2025-12-04T11:11:09.6459995Z * [new branch] vishal9-team/dtensor_parallelism_toy -> origin/vishal9-team/dtensor_parallelism_toy 2025-12-04T11:11:09.6460062Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-12-04T11:11:09.6460128Z * [new branch] vllmpin -> origin/vllmpin 2025-12-04T11:11:09.6460222Z * [new branch] vscode-recommend-pyrefly -> origin/vscode-recommend-pyrefly 2025-12-04T11:11:09.6460291Z * [new branch] wdvr-patch-1 -> origin/wdvr-patch-1 2025-12-04T11:11:09.6460358Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-12-04T11:11:09.6460424Z * [new branch] whc/pei -> origin/whc/pei 2025-12-04T11:11:09.6460489Z * [new branch] whc/pp_fix -> origin/whc/pp_fix 2025-12-04T11:11:09.6460552Z * [new branch] whc/sharding -> origin/whc/sharding 2025-12-04T11:11:09.6460618Z * [new branch] whc/sharding2 -> origin/whc/sharding2 2025-12-04T11:11:09.6460679Z * [new branch] whc/uneven -> origin/whc/uneven 2025-12-04T11:11:09.6460753Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-12-04T11:11:09.6460817Z * [new branch] win_warnings -> origin/win_warnings 2025-12-04T11:11:09.6460895Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-12-04T11:11:09.6460961Z * [new branch] xmfan-war -> origin/xmfan-war 2025-12-04T11:11:09.6461027Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-12-04T11:11:09.6461097Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-12-04T11:11:09.6461251Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-12-04T11:11:09.6461323Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-12-04T11:11:09.6461393Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-12-04T11:11:09.6462351Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-12-04T11:11:09.6462422Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-12-04T11:11:09.6462511Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-12-04T11:11:09.6462579Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-12-04T11:11:09.6462654Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-12-04T11:11:09.6462731Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-12-04T11:11:09.6462797Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-12-04T11:11:09.6462863Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-12-04T11:11:09.6462929Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-12-04T11:11:09.6462999Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-12-04T11:11:09.6463068Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-12-04T11:11:09.6463163Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 
2025-12-04T11:11:09.6463233Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-12-04T11:11:09.6463301Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-12-04T11:11:09.6463371Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-12-04T11:11:09.6463454Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-12-04T11:11:09.6463554Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-12-04T11:11:09.6463712Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T11:11:09.6463863Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T11:11:09.6463936Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-12-04T11:11:09.6464006Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-12-04T11:11:09.6464068Z * [new branch] xmfan/test -> origin/xmfan/test 2025-12-04T11:11:09.6464157Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-12-04T11:11:09.6464238Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-12-04T11:11:09.6464334Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-12-04T11:11:09.6464404Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-12-04T11:11:09.6464508Z * [new branch] yiming/run_with_start_end_rng_hop -> origin/yiming/run_with_start_end_rng_hop 2025-12-04T11:11:09.6464576Z * [new branch] yolo-llama3 -> origin/yolo-llama3 2025-12-04T11:11:09.6464650Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-12-04T11:11:09.6464743Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-12-04T11:11:09.6464823Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-12-04T11:11:09.6464889Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-12-04T11:11:09.6464963Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-12-04T11:11:09.6465023Z * [new branch] zb2p -> origin/zb2p 2025-12-04T11:11:09.6465132Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-12-04T11:11:09.6465222Z * [new branch] zhxchen17/ci/vllm_lora_oom -> origin/zhxchen17/ci/vllm_lora_oom 2025-12-04T11:11:09.6465327Z * [new branch] zhxchen17/ci/vllm_multimodal_oom -> origin/zhxchen17/ci/vllm_multimodal_oom 2025-12-04T11:11:09.6465431Z * [new branch] zhxchen17/ci/vllm_pin -> origin/zhxchen17/ci/vllm_pin 2025-12-04T11:11:09.6465556Z * [new branch] zhxchen17/dynamo/unsafe_drop_all_guards -> origin/zhxchen17/dynamo/unsafe_drop_all_guards 2025-12-04T11:11:09.6465656Z * [new branch] zhxchen17/export/call_override -> origin/zhxchen17/export/call_override 2025-12-04T11:11:09.6465747Z * [new branch] zhxchen17/export/codemod1 -> origin/zhxchen17/export/codemod1 2025-12-04T11:11:09.6465838Z * [new branch] zhxchen17/export/ctx_return -> origin/zhxchen17/export/ctx_return 2025-12-04T11:11:09.6465967Z * [new branch] zhxchen17/export/disable_side_effect_warn -> origin/zhxchen17/export/disable_side_effect_warn 2025-12-04T11:11:09.6466068Z * [new branch] zhxchen17/export/pytree_check -> origin/zhxchen17/export/pytree_check 2025-12-04T11:11:09.6466158Z * [new branch] zhxchen17/precompile/aoti -> origin/zhxchen17/precompile/aoti 2025-12-04T11:11:09.6466255Z * [new branch] zhxchen17/precompile/globals -> origin/zhxchen17/precompile/globals 
2025-12-04T11:11:09.6466374Z * [new branch] zhxchen17/precompile/inductor_guards -> origin/zhxchen17/precompile/inductor_guards 2025-12-04T11:11:09.6466449Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-12-04T11:11:09.6466556Z * [new branch] zhxchen17/torch_export_api_update -> origin/zhxchen17/torch_export_api_update 2025-12-04T11:11:09.6466633Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-12-04T11:11:09.6466710Z * [new branch] zxiiro/build-times -> origin/zxiiro/build-times 2025-12-04T11:11:09.6466787Z * [new branch] zxiiro/c7i.2xlarge -> origin/zxiiro/c7i.2xlarge 2025-12-04T11:11:09.6466867Z * [new branch] zxiiro/c7i.2xlarge.h100 -> origin/zxiiro/c7i.2xlarge.h100 2025-12-04T11:11:09.6466932Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-12-04T11:11:09.6466998Z * [new branch] zxiiro/risc64 -> origin/zxiiro/risc64 2025-12-04T11:11:09.6467091Z * [new branch] zxiiro/test-multicloud-arc -> origin/zxiiro/test-multicloud-arc 2025-12-04T11:11:09.8546655Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T11:11:09.8714681Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:09.8719554Z ##[endgroup] 2025-12-04T11:11:09.8720037Z ##[group]Determining the checkout info 2025-12-04T11:11:09.8720413Z ##[endgroup] 2025-12-04T11:11:09.8724549Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T11:11:09.8817988Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T11:11:09.8833725Z ##[group]Checking out the ref 2025-12-04T11:11:09.8835213Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:09.9954002Z Previous HEAD position was c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452) 2025-12-04T11:11:09.9959939Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T11:11:10.0074094Z ##[endgroup] 2025-12-04T11:11:10.0074536Z ##[group]Setting up auth for fetching submodules 2025-12-04T11:11:10.0080714Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T11:11:10.0128004Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T11:11:10.0153970Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T11:11:10.0178265Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T11:11:10.0197040Z ##[endgroup] 2025-12-04T11:11:10.0197369Z ##[group]Fetching submodules 2025-12-04T11:11:10.0201065Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T11:11:10.0366571Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T11:11:10.0379860Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T11:11:10.0390751Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T11:11:10.0401268Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T11:11:10.0412051Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T11:11:10.0422980Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:10.0433528Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T11:11:10.0451966Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T11:11:10.0463721Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:10.0481174Z 
Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T11:11:10.0492397Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T11:11:10.0509976Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T11:11:10.0522455Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T11:11:10.0533456Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T11:11:10.0543958Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T11:11:10.0560130Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T11:11:10.0572578Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:10.0583534Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:10.0601903Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:10.0612880Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:10.0629828Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:10.0642558Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:10.0653531Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T11:11:10.0666725Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T11:11:10.0678272Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:10.0693814Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:10.0710131Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T11:11:10.0722923Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T11:11:10.0734681Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:10.0746766Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T11:11:10.0767648Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T11:11:10.0782843Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T11:11:10.0794590Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:10.0813301Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T11:11:10.0826451Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T11:11:10.0838865Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:10.0850682Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:10.0863190Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:10.0874446Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:10.0885364Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:10.0898459Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:10.0910663Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:10.0921964Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:10.0932764Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:10.0944044Z Synchronizing 
submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:10.0954524Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:10.0975671Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:10.0990660Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:10.1008426Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:10.1019912Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:10.1032836Z Synchronizing submodule url for 'third_party/kleidiai' 2025-12-04T11:11:10.1045502Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T11:11:10.1056181Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T11:11:10.1067503Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T11:11:10.1086896Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:10.1100641Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T11:11:10.1111721Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:10.1123274Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:10.1135918Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:10.1148464Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:10.1172509Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:10.1185882Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:10.1195828Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:10.1206019Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:10.1218286Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:10.1231368Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:10.1252563Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T11:11:10.1264315Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T11:11:10.1276429Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:10.1287355Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:10.1302165Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T11:11:10.1313317Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T11:11:10.1323872Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T11:11:10.1334799Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T11:11:10.1345971Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T11:11:10.1356530Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T11:11:10.1366666Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:10.1379533Z Synchronizing submodule 
url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:10.1390433Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:10.1403032Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:10.1413193Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:10.1435727Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T11:11:10.1672011Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T11:11:10.1752004Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T11:11:10.1801493Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T11:11:10.1917552Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T11:11:10.1984050Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T11:11:10.2046363Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T11:11:10.7272902Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T11:11:10.7422333Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T11:11:10.7629576Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T11:11:10.7752030Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T11:11:10.7923666Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:10.8010375Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T11:11:10.8699755Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T11:11:10.8785452Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T11:11:10.8911723Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T11:11:10.9848689Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T11:11:11.0251221Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T11:11:11.2113170Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:11.2793749Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T11:11:11.4030803Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T11:11:11.4229403Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.4297369Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T11:11:11.4834449Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T11:11:11.4932498Z Submodule path 'third_party/flash-attention': 
checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T11:11:11.5110640Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T11:11:11.5216773Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T11:11:11.5307079Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T11:11:11.5454668Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T11:11:11.5655147Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T11:11:11.5768981Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T11:11:11.5958114Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.6050379Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T11:11:11.7063402Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T11:11:11.7143105Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T11:11:11.7215409Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T11:11:11.7295286Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T11:11:11.7391520Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T11:11:11.7446703Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T11:11:11.7511638Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T11:11:11.7580090Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T11:11:11.7637152Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T11:11:11.7698874Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T11:11:11.7762082Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.7846601Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T11:11:11.7899036Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T11:11:11.7954564Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T11:11:11.8029001Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T11:11:11.8083876Z Submodule path 
'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:11.8154579Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T11:11:11.8209117Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.8287483Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T11:11:11.8366208Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T11:11:11.8471254Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T11:11:12.0254838Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T11:11:12.0449323Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T11:11:12.0562136Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T11:11:12.0625523Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T11:11:12.0698477Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T11:11:12.0755630Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T11:11:12.0846109Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T11:11:12.0923044Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T11:11:12.1017791Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T11:11:12.1174865Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T11:11:12.1259565Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T11:11:12.1314352Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:12.1482493Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T11:11:12.1550859Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T11:11:12.3211915Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T11:11:12.3308903Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T11:11:12.3524868Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T11:11:12.3583810Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T11:11:12.3662798Z Submodule path 'third_party/pthreadpool': checked 
out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T11:11:12.3918327Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T11:11:12.4160282Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T11:11:12.4571601Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T11:11:12.4678503Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T11:11:12.4876054Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T11:11:12.4957474Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T11:11:12.5239889Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T11:11:12.5358981Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T11:11:12.5416104Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T11:11:12.5439818Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T11:11:12.5627125Z Entering 'android/libs/fbjni' 2025-12-04T11:11:12.5652011Z Entering 'third_party/FP16' 2025-12-04T11:11:12.5675592Z Entering 'third_party/FXdiv' 2025-12-04T11:11:12.5700219Z Entering 'third_party/NNPACK' 2025-12-04T11:11:12.5724551Z Entering 'third_party/NVTX' 2025-12-04T11:11:12.5750185Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:12.5772786Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:12.5801671Z Entering 'third_party/aiter' 2025-12-04T11:11:12.5825317Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:12.5855548Z Entering 'third_party/benchmark' 2025-12-04T11:11:12.5879542Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:12.5906620Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:12.5929222Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:12.5954981Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:12.5977595Z Entering 'third_party/cutlass' 2025-12-04T11:11:12.6004582Z Entering 'third_party/fbgemm' 2025-12-04T11:11:12.6029596Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:12.6048238Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:12.6073818Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:12.6094452Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:12.6119792Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:12.6141785Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:12.6164282Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:12.6187049Z Entering 'third_party/flash-attention' 2025-12-04T11:11:12.6209591Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:12.6231902Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:12.6257454Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:12.6280774Z Entering 'third_party/fmt' 2025-12-04T11:11:12.6302446Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:12.6324943Z Entering 'third_party/gloo' 2025-12-04T11:11:12.6347050Z Entering 'third_party/googletest' 2025-12-04T11:11:12.6368434Z Entering 
'third_party/ideep' 2025-12-04T11:11:12.6390561Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:12.6414712Z Entering 'third_party/ittapi' 2025-12-04T11:11:12.6437094Z Entering 'third_party/kineto' 2025-12-04T11:11:12.6458848Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:12.6479665Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:12.6501422Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:12.6523391Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:12.6544264Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:12.6572068Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:12.6595738Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:12.6616127Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:12.6637582Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:12.6658459Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:12.6680028Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:12.6701506Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.6722550Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.6747344Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:12.6767993Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:12.6790218Z Entering 'third_party/kleidiai' 2025-12-04T11:11:12.6812931Z Entering 'third_party/mimalloc' 2025-12-04T11:11:12.6834658Z Entering 'third_party/nlohmann' 2025-12-04T11:11:12.6857280Z Entering 'third_party/onnx' 2025-12-04T11:11:12.6887117Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:12.6915538Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:12.6939644Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:12.6960389Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:12.6981779Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:12.7003123Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:12.7025284Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:12.7054476Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:12.7081008Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:12.7101309Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.7125936Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.7155452Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:12.7183200Z Entering 'third_party/pocketfft' 2025-12-04T11:11:12.7206406Z Entering 'third_party/protobuf' 2025-12-04T11:11:12.7233061Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:12.7255194Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:12.7278918Z Entering 'third_party/psimd' 
2025-12-04T11:11:12.7301499Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:12.7327290Z Entering 'third_party/pybind11' 2025-12-04T11:11:12.7348917Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:12.7372917Z Entering 'third_party/sleef' 2025-12-04T11:11:12.7396252Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:12.7418412Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:12.7439184Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:12.7460214Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:12.7480746Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:12.7501945Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:12.7535691Z ##[endgroup] 2025-12-04T11:11:12.7535890Z ##[group]Persisting credentials for submodules 2025-12-04T11:11:12.7541046Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T11:11:12.7700192Z Entering 'android/libs/fbjni' 2025-12-04T11:11:12.7725264Z Entering 'third_party/FP16' 2025-12-04T11:11:12.7753733Z Entering 'third_party/FXdiv' 2025-12-04T11:11:12.7778265Z Entering 'third_party/NNPACK' 2025-12-04T11:11:12.7804288Z Entering 'third_party/NVTX' 2025-12-04T11:11:12.7829500Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:12.7853137Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:12.7883166Z Entering 'third_party/aiter' 2025-12-04T11:11:12.7909455Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:12.7938121Z Entering 'third_party/benchmark' 2025-12-04T11:11:12.7970759Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:12.7999975Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:12.8024369Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:12.8049193Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:12.8074337Z Entering 'third_party/cutlass' 2025-12-04T11:11:12.8102799Z Entering 'third_party/fbgemm' 2025-12-04T11:11:12.8129327Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:12.8151565Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:12.8177814Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:12.8200870Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:12.8229391Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:12.8253422Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:12.8276907Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:12.8306047Z Entering 'third_party/flash-attention' 2025-12-04T11:11:12.8331759Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:12.8361189Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:12.8391338Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:12.8416655Z Entering 'third_party/fmt' 2025-12-04T11:11:12.8441921Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:12.8466455Z Entering 'third_party/gloo' 2025-12-04T11:11:12.8491562Z Entering 'third_party/googletest' 2025-12-04T11:11:12.8517962Z Entering 'third_party/ideep' 2025-12-04T11:11:12.8543705Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:12.8571331Z Entering 'third_party/ittapi' 2025-12-04T11:11:12.8600427Z Entering 'third_party/kineto' 2025-12-04T11:11:12.8624099Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 
2025-12-04T11:11:12.8648625Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:12.8672604Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:12.8696615Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:12.8720922Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:12.8744619Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:12.8770393Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:12.8793492Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:12.8816155Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:12.8841162Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:12.8864320Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:12.8888139Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.8915323Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.8943812Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:12.8967538Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:12.8993428Z Entering 'third_party/kleidiai' 2025-12-04T11:11:12.9017277Z Entering 'third_party/mimalloc' 2025-12-04T11:11:12.9043379Z Entering 'third_party/nlohmann' 2025-12-04T11:11:12.9067875Z Entering 'third_party/onnx' 2025-12-04T11:11:12.9098289Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:12.9129695Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:12.9153821Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:12.9181634Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:12.9209617Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:12.9255991Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:12.9291033Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:12.9314896Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:12.9343434Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:12.9367252Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.9400550Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.9426764Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:12.9458740Z Entering 'third_party/pocketfft' 2025-12-04T11:11:12.9483307Z Entering 'third_party/protobuf' 2025-12-04T11:11:12.9513354Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:12.9537444Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:12.9565271Z Entering 'third_party/psimd' 2025-12-04T11:11:12.9590472Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:12.9616556Z Entering 'third_party/pybind11' 2025-12-04T11:11:12.9641355Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:12.9665953Z Entering 'third_party/sleef' 2025-12-04T11:11:12.9690790Z Entering 
'third_party/tensorpipe' 2025-12-04T11:11:12.9718590Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:12.9740731Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:12.9764433Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:12.9792495Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:12.9816037Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:12.9853544Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T11:11:13.0039428Z Entering 'android/libs/fbjni' 2025-12-04T11:11:13.0062575Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:13.0073513Z Entering 'third_party/FP16' 2025-12-04T11:11:13.0099948Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:13.0110972Z Entering 'third_party/FXdiv' 2025-12-04T11:11:13.0132257Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:13.0142578Z Entering 'third_party/NNPACK' 2025-12-04T11:11:13.0166292Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:13.0176876Z Entering 'third_party/NVTX' 2025-12-04T11:11:13.0197372Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:13.0207955Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:13.0230923Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:13.0241474Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:13.0262802Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:13.0277546Z Entering 'third_party/aiter' 2025-12-04T11:11:13.0302593Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:13.0314898Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:13.0336364Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0354199Z Entering 'third_party/benchmark' 2025-12-04T11:11:13.0374823Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:13.0385482Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:13.0407757Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0420514Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:13.0440968Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:13.0451435Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:13.0474225Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:13.0484939Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:13.0506273Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:13.0518718Z Entering 'third_party/cutlass' 2025-12-04T11:11:13.0570377Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:13.0635841Z Entering 'third_party/fbgemm' 2025-12-04T11:11:13.0669571Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:13.0704285Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:13.0725442Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:13.0732281Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:13.0758677Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0775893Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:13.0801016Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:13.0808898Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:13.0824927Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:13.0835633Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:13.0853996Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:13.0862151Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:13.0879974Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:13.0889004Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:13.0906447Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:13.0920075Z Entering 'third_party/flash-attention' 2025-12-04T11:11:13.0941435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:13.0953071Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:13.0973189Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0984645Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:13.1003546Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:13.1020027Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:13.1041354Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:13.1054774Z Entering 'third_party/fmt' 2025-12-04T11:11:13.1074764Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:13.1086033Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:13.1106598Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:13.1117150Z Entering 'third_party/gloo' 2025-12-04T11:11:13.1137787Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:13.1149606Z Entering 'third_party/googletest' 2025-12-04T11:11:13.1169038Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 
2025-12-04T11:11:13.1180390Z Entering 'third_party/ideep' 2025-12-04T11:11:13.1199446Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:13.1209547Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:13.1227776Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:13.1240955Z Entering 'third_party/ittapi' 2025-12-04T11:11:13.1260798Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:13.1272037Z Entering 'third_party/kineto' 2025-12-04T11:11:13.1291628Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:13.1302689Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:13.1320839Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:13.1329209Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:13.1350304Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:13.1359688Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:13.1378577Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:13.1386958Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:13.1406092Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:13.1414758Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:13.1434251Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:13.1442189Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:13.1461739Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:13.1471132Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:13.1490043Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:13.1499021Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:13.1517405Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.1527939Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:13.1547579Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:13.1556946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:13.1575678Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:13.1584464Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:13.1602742Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:13.1611252Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.1630596Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:13.1640599Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.1659718Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:13.1669993Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:13.1689168Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:13.1697441Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:13.1716716Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.1729724Z Entering 'third_party/kleidiai' 2025-12-04T11:11:13.1755346Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:13.1770993Z Entering 'third_party/mimalloc' 2025-12-04T11:11:13.1795150Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:13.1806894Z Entering 'third_party/nlohmann' 2025-12-04T11:11:13.1827686Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:13.1840070Z Entering 'third_party/onnx' 2025-12-04T11:11:13.1860402Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:13.1879469Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:13.1899490Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:13.1910423Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:13.1931521Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:13.1942802Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:13.1961683Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:13.1970257Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:13.1988328Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.1996429Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:13.2014817Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:13.2023624Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:13.2041577Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:13.2050843Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:13.2068547Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:13.2076760Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:13.2094999Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:13.2103311Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:13.2121254Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:13.2129622Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.2147859Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:13.2157243Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.2175412Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:13.2185025Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:13.2203138Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:13.2221573Z Entering 'third_party/pocketfft' 2025-12-04T11:11:13.2241056Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:13.2251338Z Entering 'third_party/protobuf' 2025-12-04T11:11:13.2269678Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:13.2280731Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:13.2298677Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:13.2307056Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:13.2325309Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.2335522Z Entering 'third_party/psimd' 2025-12-04T11:11:13.2355681Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:13.2366195Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:13.2385269Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:13.2395259Z Entering 'third_party/pybind11' 2025-12-04T11:11:13.2413860Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 
2025-12-04T11:11:13.2424576Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:13.2443411Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:13.2453793Z Entering 'third_party/sleef' 2025-12-04T11:11:13.2472549Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:13.2483437Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:13.2502554Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:13.2512850Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:13.2531895Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.2540632Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:13.2558434Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:13.2566903Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:13.2587530Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:13.2596420Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:13.2615040Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:13.2622875Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:13.2642238Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:13.2840763Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T11:11:13.3026272Z Entering 'android/libs/fbjni' 2025-12-04T11:11:13.3047339Z Entering 'third_party/FP16' 2025-12-04T11:11:13.3070341Z Entering 'third_party/FXdiv' 2025-12-04T11:11:13.3092509Z Entering 'third_party/NNPACK' 2025-12-04T11:11:13.3114198Z Entering 'third_party/NVTX' 2025-12-04T11:11:13.3136707Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:13.3157204Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:13.3184067Z Entering 'third_party/aiter' 2025-12-04T11:11:13.3206096Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:13.3229827Z Entering 'third_party/benchmark' 2025-12-04T11:11:13.3251040Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:13.3281361Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:13.3302406Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:13.3324533Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:13.3346978Z Entering 'third_party/cutlass' 2025-12-04T11:11:13.3383016Z Entering 'third_party/fbgemm' 2025-12-04T11:11:13.3411734Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:13.3430422Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:13.3451760Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:13.3470275Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:13.3488044Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:13.3503061Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:13.3533005Z Entering 'third_party/fbgemm/external/json' 
2025-12-04T11:11:13.3566115Z Entering 'third_party/flash-attention' 2025-12-04T11:11:13.3597855Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:13.3625646Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:13.3657424Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:13.3683735Z Entering 'third_party/fmt' 2025-12-04T11:11:13.3707717Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:13.3742577Z Entering 'third_party/gloo' 2025-12-04T11:11:13.3766634Z Entering 'third_party/googletest' 2025-12-04T11:11:13.3789049Z Entering 'third_party/ideep' 2025-12-04T11:11:13.3822079Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:13.3852838Z Entering 'third_party/ittapi' 2025-12-04T11:11:13.3879326Z Entering 'third_party/kineto' 2025-12-04T11:11:13.3907353Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:13.3948052Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:13.3970736Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:13.3986052Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:13.4004528Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:13.4023663Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:13.4053182Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:13.4069971Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:13.4086747Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:13.4106252Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:13.4126186Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:13.4154481Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.4172426Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.4203336Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:13.4241729Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:13.4274681Z Entering 'third_party/kleidiai' 2025-12-04T11:11:13.4302987Z Entering 'third_party/mimalloc' 2025-12-04T11:11:13.4331232Z Entering 'third_party/nlohmann' 2025-12-04T11:11:13.4359423Z Entering 'third_party/onnx' 2025-12-04T11:11:13.4389963Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:13.4423514Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:13.4447479Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:13.4469436Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:13.4492169Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:13.4516450Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:13.4538228Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:13.4564685Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:13.4585820Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:13.4609468Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.4631724Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.4660036Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:13.4695232Z Entering 'third_party/pocketfft' 2025-12-04T11:11:13.4718666Z Entering 'third_party/protobuf' 2025-12-04T11:11:13.4751225Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:13.4773196Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:13.4803812Z Entering 'third_party/psimd' 2025-12-04T11:11:13.4826404Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:13.4848085Z Entering 'third_party/pybind11' 2025-12-04T11:11:13.4873111Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:13.4898930Z Entering 'third_party/sleef' 2025-12-04T11:11:13.4924464Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:13.4947355Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:13.4967315Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:13.4993145Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:13.5014585Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:13.5035761Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:13.5073841Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T11:11:13.5242483Z Entering 'android/libs/fbjni' 2025-12-04T11:11:13.5267698Z Entering 'third_party/FP16' 2025-12-04T11:11:13.5288672Z Entering 'third_party/FXdiv' 2025-12-04T11:11:13.5311431Z Entering 'third_party/NNPACK' 2025-12-04T11:11:13.5334335Z Entering 'third_party/NVTX' 2025-12-04T11:11:13.5355649Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:13.5375927Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:13.5403544Z Entering 'third_party/aiter' 2025-12-04T11:11:13.5426584Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:13.5451439Z Entering 'third_party/benchmark' 2025-12-04T11:11:13.5475041Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:13.5499902Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:13.5521621Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:13.5543441Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:13.5570253Z Entering 'third_party/cutlass' 2025-12-04T11:11:13.5598460Z Entering 'third_party/fbgemm' 2025-12-04T11:11:13.5619724Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:13.5640014Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:13.5668453Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:13.5688649Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:13.5713867Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:13.5733603Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:13.5753117Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:13.5775091Z Entering 'third_party/flash-attention' 2025-12-04T11:11:13.5798935Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:13.5821904Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:13.5847531Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:13.5871548Z Entering 'third_party/fmt' 2025-12-04T11:11:13.5893585Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:13.5916212Z 
Entering 'third_party/gloo' 2025-12-04T11:11:13.5939374Z Entering 'third_party/googletest' 2025-12-04T11:11:13.5964593Z Entering 'third_party/ideep' 2025-12-04T11:11:13.5985466Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:13.6012839Z Entering 'third_party/ittapi' 2025-12-04T11:11:13.6035607Z Entering 'third_party/kineto' 2025-12-04T11:11:13.6056390Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:13.6078241Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:13.6100185Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:13.6126329Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:13.6148387Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:13.6170422Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:13.6193800Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:13.6215581Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:13.6239405Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:13.6261909Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:13.6282934Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:13.6304064Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.6327169Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.6352647Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:13.6374880Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:13.6397868Z Entering 'third_party/kleidiai' 2025-12-04T11:11:13.6422816Z Entering 'third_party/mimalloc' 2025-12-04T11:11:13.6443098Z Entering 'third_party/nlohmann' 2025-12-04T11:11:13.6466501Z Entering 'third_party/onnx' 2025-12-04T11:11:13.6494219Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:13.6519937Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:13.6547287Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:13.6568066Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:13.6590650Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:13.6616735Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:13.6640977Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:13.6666428Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:13.6688896Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:13.6713113Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.6740539Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.6768947Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:13.6799872Z Entering 'third_party/pocketfft' 2025-12-04T11:11:13.6821502Z Entering 'third_party/protobuf' 2025-12-04T11:11:13.6843484Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:13.6868039Z 
Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:13.6891566Z Entering 'third_party/psimd' 2025-12-04T11:11:13.6918597Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:13.6941426Z Entering 'third_party/pybind11' 2025-12-04T11:11:13.6965340Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:13.6986477Z Entering 'third_party/sleef' 2025-12-04T11:11:13.7006881Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:13.7029145Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:13.7049702Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:13.7069612Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:13.7096827Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:13.7116360Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:13.7152573Z ##[endgroup] 2025-12-04T11:11:13.7319994Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T11:11:13.7412437Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:13.7535987Z ##[group]Run actions/checkout@v4 2025-12-04T11:11:13.7536125Z with: 2025-12-04T11:11:13.7536233Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:13.7536367Z fetch-depth: 0 2025-12-04T11:11:13.7536468Z submodules: recursive 2025-12-04T11:11:13.7536569Z show-progress: false 2025-12-04T11:11:13.7536689Z repository: pytorch/pytorch 2025-12-04T11:11:13.7536845Z token: *** 2025-12-04T11:11:13.7536936Z ssh-strict: true 2025-12-04T11:11:13.7537035Z ssh-user: git 2025-12-04T11:11:13.7537134Z persist-credentials: true 2025-12-04T11:11:13.7537244Z clean: true 2025-12-04T11:11:13.7537347Z sparse-checkout-cone-mode: true 2025-12-04T11:11:13.7537476Z fetch-tags: false 2025-12-04T11:11:13.7537577Z lfs: false 2025-12-04T11:11:13.7537670Z set-safe-directory: true 2025-12-04T11:11:13.7537773Z env: 2025-12-04T11:11:13.7537868Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:13.7537985Z ##[endgroup] 2025-12-04T11:11:13.8004499Z Syncing repository: pytorch/pytorch 2025-12-04T11:11:13.8004819Z ##[group]Getting Git version info 2025-12-04T11:11:13.8005025Z Working directory is '/home/runner/_work/pytorch/pytorch' 2025-12-04T11:11:13.8019041Z [command]/usr/bin/git version 2025-12-04T11:11:13.8047492Z git version 2.52.0 2025-12-04T11:11:13.8063612Z ##[endgroup] 2025-12-04T11:11:13.8070036Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/3665f695-7bbc-4c6c-b9ba-d298500bfe04/.gitconfig' 2025-12-04T11:11:13.8076131Z Temporarily overriding HOME='/home/runner/_work/_temp/3665f695-7bbc-4c6c-b9ba-d298500bfe04' before making global git config changes 2025-12-04T11:11:13.8076669Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T11:11:13.8079556Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T11:11:13.8104292Z [command]/usr/bin/git config --local --get remote.origin.url 2025-12-04T11:11:13.8121026Z https://github.com/pytorch/pytorch 2025-12-04T11:11:13.8135932Z ##[group]Removing previously created refs, to avoid conflicts 2025-12-04T11:11:13.8139452Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD 2025-12-04T11:11:13.8154075Z HEAD 2025-12-04T11:11:13.8177910Z ##[endgroup] 2025-12-04T11:11:13.8179535Z [command]/usr/bin/git submodule status 2025-12-04T11:11:13.8386517Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-12-04T11:11:13.8434322Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 
third_party/FP16 (4dfe081) 2025-12-04T11:11:13.8486678Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-12-04T11:11:13.8539952Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-12-04T11:11:13.8571860Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93) 2025-12-04T11:11:13.8624332Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-12-04T11:11:13.8906886Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-12-04T11:11:13.8931261Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-12-04T11:11:13.8949101Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-12-04T11:11:13.9002903Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-12-04T11:11:13.9083843Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-12-04T11:11:13.9163552Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30) 2025-12-04T11:11:13.9184858Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c) 2025-12-04T11:11:13.9252460Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1) 2025-12-04T11:11:13.9273396Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39) 2025-12-04T11:11:13.9337289Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-12-04T11:11:13.9353729Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-12-04T11:11:13.9672309Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0) 2025-12-04T11:11:13.9758831Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-12-04T11:11:13.9841606Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0) 2025-12-04T11:11:14.0021628Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-12-04T11:11:14.0084261Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-12-04T11:11:14.0125389Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-12-04T11:11:14.0270228Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main) 2025-12-04T11:11:14.0287914Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0) 2025-12-04T11:11:14.0303829Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-12-04T11:11:14.0320037Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-12-04T11:11:14.0582941Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-12-04T11:11:14.0603098Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-12-04T11:11:14.0622668Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-12-04T11:11:14.0896114Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-12-04T11:11:14.0966763Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T11:11:14.1010706Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T11:11:14.1032250Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 
(v3.0.1) 2025-12-04T11:11:14.1082155Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T11:11:14.1146562Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T11:11:14.1189683Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T11:11:14.1206080Z ##[group]Cleaning the repository 2025-12-04T11:11:14.1212014Z [command]/usr/bin/git clean -ffdx 2025-12-04T11:11:14.1368372Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T11:11:14.2175025Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T11:11:14.2243892Z ##[endgroup] 2025-12-04T11:11:14.2246362Z ##[group]Disabling automatic garbage collection 2025-12-04T11:11:14.2251501Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T11:11:14.2280249Z ##[endgroup] 2025-12-04T11:11:14.2280497Z ##[group]Setting up auth 2025-12-04T11:11:14.2283724Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T11:11:14.2311014Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T11:11:14.2502846Z Entering 'android/libs/fbjni' 2025-12-04T11:11:14.2527475Z Entering 'third_party/FP16' 2025-12-04T11:11:14.2553853Z Entering 'third_party/FXdiv' 2025-12-04T11:11:14.2580065Z Entering 'third_party/NNPACK' 2025-12-04T11:11:14.2606645Z Entering 'third_party/NVTX' 2025-12-04T11:11:14.2630499Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:14.2656145Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:14.2686551Z Entering 'third_party/aiter' 2025-12-04T11:11:14.2717154Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:14.2742979Z Entering 'third_party/benchmark' 2025-12-04T11:11:14.2767063Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:14.2794645Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:14.2819519Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:14.2843927Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:14.2869283Z Entering 'third_party/cutlass' 2025-12-04T11:11:14.2898067Z Entering 'third_party/fbgemm' 2025-12-04T11:11:14.2924624Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:14.2976631Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:14.3021769Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:14.3054114Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:14.3079090Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:14.3105227Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:14.3120238Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:14.3165326Z Entering 'third_party/flash-attention' 2025-12-04T11:11:14.3197738Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:14.3216984Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:14.3240678Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:14.3265340Z Entering 'third_party/fmt' 2025-12-04T11:11:14.3292728Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:14.3311866Z Entering 'third_party/gloo' 2025-12-04T11:11:14.3339096Z Entering 'third_party/googletest' 2025-12-04T11:11:14.3365525Z Entering 'third_party/ideep' 2025-12-04T11:11:14.3391260Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:14.3412485Z Entering 'third_party/ittapi' 
2025-12-04T11:11:14.3433979Z Entering 'third_party/kineto' 2025-12-04T11:11:14.3459608Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:14.3496254Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:14.3523669Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:14.3550794Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:14.3568821Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:14.3588926Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:14.3608539Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:14.3634822Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:14.3668306Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:14.3691797Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:14.3713293Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:14.3731467Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.3754662Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.3785124Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:14.3817429Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:14.3842083Z Entering 'third_party/kleidiai' 2025-12-04T11:11:14.3865843Z Entering 'third_party/mimalloc' 2025-12-04T11:11:14.3888970Z Entering 'third_party/nlohmann' 2025-12-04T11:11:14.3910873Z Entering 'third_party/onnx' 2025-12-04T11:11:14.3938643Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:14.3968283Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:14.3993532Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:14.4021352Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:14.4043427Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:14.4063519Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:14.4084036Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:14.4104842Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:14.4125549Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:14.4146049Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.4175918Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.4200111Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:14.4231462Z Entering 'third_party/pocketfft' 2025-12-04T11:11:14.4253438Z Entering 'third_party/protobuf' 2025-12-04T11:11:14.4277660Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:14.4302458Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:14.4331298Z Entering 'third_party/psimd' 2025-12-04T11:11:14.4353374Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:14.4375426Z Entering 'third_party/pybind11' 2025-12-04T11:11:14.4397099Z 
Entering 'third_party/python-peachpy' 2025-12-04T11:11:14.4418319Z Entering 'third_party/sleef' 2025-12-04T11:11:14.4440177Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:14.4483132Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:14.4484845Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:14.4508070Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:14.4534962Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:14.4558866Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:14.4610400Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T11:11:14.4630995Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4639822Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T11:11:14.4664465Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T11:11:14.4837611Z Entering 'android/libs/fbjni' 2025-12-04T11:11:14.4850692Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4875567Z Entering 'third_party/FP16' 2025-12-04T11:11:14.4890492Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4908196Z Entering 'third_party/FXdiv' 2025-12-04T11:11:14.4932279Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4965924Z Entering 'third_party/NNPACK' 2025-12-04T11:11:14.4979714Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4996626Z Entering 'third_party/NVTX' 2025-12-04T11:11:14.5010539Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5031095Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:14.5052955Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5070606Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:14.5084964Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5114359Z Entering 'third_party/aiter' 2025-12-04T11:11:14.5129253Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5149634Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:14.5169154Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5203059Z Entering 'third_party/benchmark' 2025-12-04T11:11:14.5214610Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5233929Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:14.5257378Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5281340Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:14.5302635Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5321326Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:14.5336695Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5361701Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:14.5382190Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5410186Z Entering 'third_party/cutlass' 2025-12-04T11:11:14.5425624Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5449284Z Entering 'third_party/fbgemm' 2025-12-04T11:11:14.5463623Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5485090Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:14.5499052Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5516706Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:14.5529863Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5549532Z 
Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:14.5560502Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5575988Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:14.5589319Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5611371Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:14.5623926Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5640360Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:14.5653085Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5668667Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:14.5680871Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5700318Z Entering 'third_party/flash-attention' 2025-12-04T11:11:14.5727498Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5748806Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:14.5762053Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5790640Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:14.5814483Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5845357Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:14.5861201Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5883233Z Entering 'third_party/fmt' 2025-12-04T11:11:14.5897612Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5916753Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:14.5932380Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5959224Z Entering 'third_party/gloo' 2025-12-04T11:11:14.5973275Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5993051Z Entering 'third_party/googletest' 2025-12-04T11:11:14.6006790Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6027388Z Entering 'third_party/ideep' 2025-12-04T11:11:14.6040625Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6058926Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:14.6072119Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6093474Z Entering 'third_party/ittapi' 2025-12-04T11:11:14.6107312Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6129369Z Entering 'third_party/kineto' 2025-12-04T11:11:14.6142917Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6160579Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:14.6173076Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6188651Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:14.6201432Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6221788Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:14.6235102Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6251250Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:14.6262954Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6278629Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:14.6290596Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6306650Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:14.6318803Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6338805Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:14.6356457Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6375539Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:14.6387477Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6405576Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:14.6419246Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6438011Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:14.6450955Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6468749Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:14.6481472Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6497685Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.6510831Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6529886Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.6542578Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6562262Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:14.6588419Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6607949Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:14.6627488Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6654113Z Entering 'third_party/kleidiai' 2025-12-04T11:11:14.6671572Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6698254Z Entering 'third_party/mimalloc' 2025-12-04T11:11:14.6715120Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6738855Z Entering 'third_party/nlohmann' 2025-12-04T11:11:14.6759551Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6782747Z Entering 'third_party/onnx' 2025-12-04T11:11:14.6799172Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6832373Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:14.6850101Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6878132Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:14.6899037Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6924440Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:14.6944609Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6968767Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:14.6986684Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7007955Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:14.7024257Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7044627Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:14.7060590Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7080305Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:14.7102674Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7120367Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:14.7134848Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7149819Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:14.7164038Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7180161Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.7195284Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7222227Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.7235655Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7256290Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:14.7269265Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7293521Z Entering 'third_party/pocketfft' 2025-12-04T11:11:14.7307916Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7326867Z Entering 'third_party/protobuf' 2025-12-04T11:11:14.7347312Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7370725Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:14.7384461Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7405825Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:14.7418543Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7436714Z Entering 'third_party/psimd' 2025-12-04T11:11:14.7454024Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7471504Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:14.7486040Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7503737Z Entering 'third_party/pybind11' 2025-12-04T11:11:14.7518947Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7536514Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:14.7549590Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7567020Z Entering 'third_party/sleef' 2025-12-04T11:11:14.7580564Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7597971Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:14.7612094Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7628844Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:14.7642749Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7659542Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:14.7671062Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7688454Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:14.7700372Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7716977Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:14.7730847Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7754977Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:14.7769214Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7806337Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.7832697Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T11:11:14.8001127Z Entering 'android/libs/fbjni' 2025-12-04T11:11:14.8018601Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:14.8034568Z Entering 'third_party/FP16' 2025-12-04T11:11:14.8048478Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:14.8060191Z Entering 'third_party/FXdiv' 2025-12-04T11:11:14.8074032Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:14.8084328Z Entering 'third_party/NNPACK' 2025-12-04T11:11:14.8094478Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:14.8105275Z Entering 'third_party/NVTX' 2025-12-04T11:11:14.8117075Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:14.8126868Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:14.8137567Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:14.8158449Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:14.8169671Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:14.8187119Z Entering 'third_party/aiter' 2025-12-04T11:11:14.8199452Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:14.8210031Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:14.8216463Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8234977Z Entering 'third_party/benchmark' 2025-12-04T11:11:14.8247429Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:14.8258829Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:14.8269847Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8286119Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:14.8296548Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:14.8315681Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:14.8328010Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:14.8338636Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:14.8357533Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:14.8376006Z Entering 'third_party/cutlass' 2025-12-04T11:11:14.8386120Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:14.8413273Z Entering 'third_party/fbgemm' 2025-12-04T11:11:14.8427021Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:14.8436575Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:14.8446226Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:14.8454484Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:14.8464811Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8476842Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:14.8485157Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:14.8492040Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:14.8500414Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:14.8510214Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:14.8519357Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:14.8528296Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:14.8534536Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:14.8540828Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:14.8548748Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:14.8559145Z Entering 'third_party/flash-attention' 2025-12-04T11:11:14.8568632Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:14.8576917Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:14.8598360Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8616493Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:14.8631582Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:14.8650686Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:14.8666761Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:14.8677472Z Entering 'third_party/fmt' 2025-12-04T11:11:14.8688958Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:14.8699325Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:14.8714798Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:14.8729348Z Entering 'third_party/gloo' 2025-12-04T11:11:14.8741561Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:14.8756765Z Entering 'third_party/googletest' 2025-12-04T11:11:14.8766817Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.8776844Z Entering 'third_party/ideep' 2025-12-04T11:11:14.8786460Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:14.8798673Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:14.8809228Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:14.8821802Z Entering 'third_party/ittapi' 2025-12-04T11:11:14.8833991Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:14.8843021Z Entering 'third_party/kineto' 2025-12-04T11:11:14.8852351Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:14.8866182Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:14.8873530Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:14.8880736Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:14.8913792Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:14.8924781Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:14.8936163Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:14.8942931Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:14.8952862Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:14.8975728Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:14.8986279Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:14.8991868Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:14.9003311Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:14.9011135Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:14.9020547Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:14.9027774Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:14.9036148Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9042713Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:14.9051184Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:14.9057684Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:14.9064971Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:14.9072834Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:14.9080607Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:14.9088320Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.9096815Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:14.9104625Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.9113791Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:14.9125165Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:14.9134854Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:14.9142518Z 
Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:14.9153270Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9164065Z Entering 'third_party/kleidiai' 2025-12-04T11:11:14.9176013Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:14.9187668Z Entering 'third_party/mimalloc' 2025-12-04T11:11:14.9198264Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:14.9209716Z Entering 'third_party/nlohmann' 2025-12-04T11:11:14.9220046Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:14.9231453Z Entering 'third_party/onnx' 2025-12-04T11:11:14.9241796Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:14.9259387Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:14.9268842Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:14.9279555Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:14.9290759Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:14.9301251Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:14.9310224Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:14.9317750Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:14.9326613Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9334385Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:14.9352167Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:14.9366998Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:14.9379828Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:14.9389423Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:14.9398816Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:14.9408732Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:14.9418423Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:14.9425589Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:14.9436055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:14.9443611Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.9452504Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:14.9461183Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.9471369Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:14.9481036Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:14.9489833Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:14.9509317Z Entering 'third_party/pocketfft' 2025-12-04T11:11:14.9520692Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:14.9531972Z Entering 'third_party/protobuf' 2025-12-04T11:11:14.9542558Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:14.9553628Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:14.9562981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:14.9570785Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:14.9581639Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9592138Z Entering 'third_party/psimd' 2025-12-04T11:11:14.9602818Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:14.9614807Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:14.9624972Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:14.9635506Z Entering 'third_party/pybind11' 2025-12-04T11:11:14.9645802Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:14.9656351Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:14.9674724Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:14.9685643Z Entering 'third_party/sleef' 2025-12-04T11:11:14.9696882Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:14.9707693Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:14.9718904Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:14.9729398Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:14.9738942Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9754892Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:14.9772030Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:14.9782102Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:14.9791479Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:14.9799937Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2025-12-04T11:11:14.9815054Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:14.9822949Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:14.9833947Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:14.9869143Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9895222Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9914101Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9931420Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9947764Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9964436Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9980651Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9997377Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0012485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0028547Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0043334Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0059954Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0083678Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0099923Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0114757Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0130754Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0144183Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0159874Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0176157Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0193922Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0208596Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0239227Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0261106Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0277422Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0296332Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0311701Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0326626Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0341806Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0359110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0374122Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0390301Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0404949Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0420027Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0437482Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T11:11:15.0453210Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0479562Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0499396Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0516077Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0533156Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0548638Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0564426Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0581317Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0599242Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0614430Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0628909Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0643919Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0658339Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0672813Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0686644Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0701563Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0716112Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0730314Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0744182Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0758573Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0786305Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0814110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0843301Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0872448Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0893509Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0915508Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0935088Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0954509Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0974814Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0993558Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1011710Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1036493Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1056118Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1076309Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1096270Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1116102Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1135220Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1154690Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1173949Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1193077Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1211051Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1231501Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1249108Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1268770Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1288455Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1308027Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1327976Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1350612Z [command]/usr/bin/git config --local http.https://github.com/.extraheader 
AUTHORIZATION: basic *** 2025-12-04T11:11:15.1589412Z ##[endgroup] 2025-12-04T11:11:15.1589870Z ##[group]Fetching the repository 2025-12-04T11:11:15.1594374Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T11:11:16.8122301Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T11:11:16.8261687Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:16.8265467Z ##[endgroup] 2025-12-04T11:11:16.8265831Z ##[group]Determining the checkout info 2025-12-04T11:11:16.8266700Z ##[endgroup] 2025-12-04T11:11:16.8271709Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T11:11:16.8405773Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T11:11:16.8430385Z ##[group]Checking out the ref 2025-12-04T11:11:16.8434309Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:16.8790624Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T11:11:16.8799138Z ##[endgroup] 2025-12-04T11:11:16.8799532Z ##[group]Setting up auth for fetching submodules 2025-12-04T11:11:16.8805133Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T11:11:16.8851238Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T11:11:16.8874825Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T11:11:16.8903951Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T11:11:16.8921919Z ##[endgroup] 2025-12-04T11:11:16.8922163Z ##[group]Fetching submodules 2025-12-04T11:11:16.8924386Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T11:11:16.9112738Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T11:11:16.9123368Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T11:11:16.9135045Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T11:11:16.9146044Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T11:11:16.9157189Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T11:11:16.9168203Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:16.9179083Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T11:11:16.9195589Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T11:11:16.9208636Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:16.9223178Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T11:11:16.9234246Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T11:11:16.9247424Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T11:11:16.9258303Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T11:11:16.9269772Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T11:11:16.9280627Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T11:11:16.9294070Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T11:11:16.9305071Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:16.9315370Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:16.9330551Z Synchronizing 
submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:16.9341190Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:16.9353786Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:16.9364296Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:16.9380823Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T11:11:16.9394172Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T11:11:16.9405796Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:16.9420263Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:16.9436234Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T11:11:16.9449024Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T11:11:16.9460325Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:16.9470254Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T11:11:16.9481126Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T11:11:16.9490645Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T11:11:16.9503963Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:16.9515698Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T11:11:16.9526881Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T11:11:16.9537484Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:16.9551980Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:16.9564413Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:16.9580005Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:16.9591135Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:16.9601885Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:16.9621612Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:16.9639961Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:16.9651583Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:16.9661374Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:16.9670873Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:16.9684324Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:16.9703905Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:16.9719134Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:16.9731317Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:16.9744070Z Synchronizing submodule url for 
'third_party/kleidiai' 2025-12-04T11:11:16.9758616Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T11:11:16.9770048Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T11:11:16.9780532Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T11:11:16.9799238Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:16.9816156Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T11:11:16.9829072Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:16.9846359Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:16.9857591Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:16.9867719Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:16.9879583Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:16.9890166Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:16.9900122Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:16.9913407Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:16.9923275Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:16.9944190Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:16.9962197Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T11:11:16.9973302Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T11:11:16.9993843Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:17.0009319Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:17.0036270Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T11:11:17.0051559Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T11:11:17.0064702Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T11:11:17.0098132Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T11:11:17.0124286Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T11:11:17.0146320Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T11:11:17.0163494Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:17.0174836Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:17.0184363Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:17.0194113Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:17.0205781Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:17.0249406Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T11:11:17.0446090Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T11:11:17.0498419Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T11:11:17.0540176Z Submodule path 'third_party/FXdiv': checked out 
'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T11:11:17.0601404Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T11:11:17.0682260Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T11:11:17.0747482Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T11:11:17.0904479Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T11:11:17.1120336Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T11:11:17.1334952Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T11:11:17.1411819Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T11:11:17.1623100Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:17.1700862Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T11:11:17.1755731Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T11:11:17.1837004Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T11:11:17.1942159Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T11:11:17.2070666Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T11:11:17.2128211Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T11:11:17.2398448Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:17.2469046Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T11:11:17.2605804Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T11:11:17.2683270Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.2738409Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T11:11:17.2840116Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T11:11:17.2921333Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T11:11:17.3101252Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T11:11:17.3220209Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T11:11:17.3316905Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T11:11:17.3371620Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T11:11:17.3439931Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T11:11:17.3497399Z Submodule path 'third_party/gloo': checked out 
'54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T11:11:17.3556862Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.3614962Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T11:11:17.3826853Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T11:11:17.3885630Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T11:11:17.3945555Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T11:11:17.4034069Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T11:11:17.4107214Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T11:11:17.4162040Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T11:11:17.4209919Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T11:11:17.4262248Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T11:11:17.4331533Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T11:11:17.4394188Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T11:11:17.4444273Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.4531024Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T11:11:17.4574072Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T11:11:17.4640595Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T11:11:17.4730636Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T11:11:17.4799385Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:17.4864213Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T11:11:17.4919032Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.4997768Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T11:11:17.5074210Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T11:11:17.5165049Z Submodule path 'third_party/nlohmann': checked out 
'55f93686c01528224f448c19128836e7df245f72' 2025-12-04T11:11:17.5335050Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T11:11:17.5411182Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T11:11:17.5497851Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T11:11:17.5551256Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T11:11:17.5602464Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T11:11:17.5650248Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T11:11:17.5730547Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T11:11:17.5787472Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T11:11:17.5833799Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T11:11:17.5903961Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T11:11:17.5983584Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T11:11:17.6072711Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:17.6229919Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T11:11:17.6309437Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T11:11:17.6505068Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T11:11:17.6595768Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T11:11:17.6653294Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T11:11:17.6735389Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T11:11:17.6792760Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T11:11:17.6869870Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T11:11:17.6919963Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T11:11:17.6982493Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T11:11:17.7053589Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T11:11:17.7121476Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T11:11:17.7180909Z Submodule path 'third_party/tensorpipe/third_party/libnop': 
checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T11:11:17.7358508Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T11:11:17.7449933Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T11:11:17.7500849Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T11:11:17.7549160Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T11:11:17.7715631Z Entering 'android/libs/fbjni' 2025-12-04T11:11:17.7739792Z Entering 'third_party/FP16' 2025-12-04T11:11:17.7764717Z Entering 'third_party/FXdiv' 2025-12-04T11:11:17.7788905Z Entering 'third_party/NNPACK' 2025-12-04T11:11:17.7812711Z Entering 'third_party/NVTX' 2025-12-04T11:11:17.7853024Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:17.7895087Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:17.7934401Z Entering 'third_party/aiter' 2025-12-04T11:11:17.7980568Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:17.8018426Z Entering 'third_party/benchmark' 2025-12-04T11:11:17.8047639Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:17.8093673Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:17.8120128Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:17.8155152Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:17.8188661Z Entering 'third_party/cutlass' 2025-12-04T11:11:17.8217286Z Entering 'third_party/fbgemm' 2025-12-04T11:11:17.8253623Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:17.8277423Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:17.8321120Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:17.8353180Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:17.8388671Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:17.8418990Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:17.8444743Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:17.8476056Z Entering 'third_party/flash-attention' 2025-12-04T11:11:17.8503702Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:17.8531164Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:17.8556057Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:17.8591339Z Entering 'third_party/fmt' 2025-12-04T11:11:17.8614565Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:17.8635780Z Entering 'third_party/gloo' 2025-12-04T11:11:17.8657356Z Entering 'third_party/googletest' 2025-12-04T11:11:17.8677476Z Entering 'third_party/ideep' 2025-12-04T11:11:17.8699962Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:17.8722925Z Entering 'third_party/ittapi' 2025-12-04T11:11:17.8742502Z Entering 'third_party/kineto' 2025-12-04T11:11:17.8761636Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:17.8795534Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:17.8822374Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:17.8844019Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:17.8871759Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:17.8917858Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:17.8965204Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:17.9002630Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:17.9025885Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:17.9063167Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:17.9085280Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:17.9111681Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:17.9156684Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:17.9192886Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:17.9221031Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:17.9256929Z Entering 'third_party/kleidiai' 2025-12-04T11:11:17.9286519Z Entering 'third_party/mimalloc' 2025-12-04T11:11:17.9315266Z Entering 'third_party/nlohmann' 2025-12-04T11:11:17.9336805Z Entering 'third_party/onnx' 2025-12-04T11:11:17.9381534Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:17.9426752Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:17.9458228Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:17.9493572Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:17.9523706Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:17.9557137Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:17.9580831Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:17.9618902Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:17.9647628Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:17.9675181Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:17.9704080Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:17.9736664Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:17.9773519Z Entering 'third_party/pocketfft' 2025-12-04T11:11:17.9803352Z Entering 'third_party/protobuf' 2025-12-04T11:11:17.9827019Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:17.9849918Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:17.9871808Z Entering 'third_party/psimd' 2025-12-04T11:11:17.9900134Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:17.9921701Z Entering 'third_party/pybind11' 2025-12-04T11:11:17.9947223Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:17.9967181Z Entering 'third_party/sleef' 2025-12-04T11:11:17.9988893Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.0010067Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.0029240Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.0047837Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.0067194Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.0089331Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.0122308Z 
##[endgroup] 2025-12-04T11:11:18.0122576Z ##[group]Persisting credentials for submodules 2025-12-04T11:11:18.0129225Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T11:11:18.0280292Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.0294138Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0294331Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0311481Z Entering 'third_party/FP16' 2025-12-04T11:11:18.0323551Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0323739Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0338392Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.0352090Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0352294Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0367793Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.0380615Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0380804Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0397669Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.0409886Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0410079Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0425275Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.0441989Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0442127Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0466881Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.0480358Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0480644Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0513444Z Entering 'third_party/aiter' 2025-12-04T11:11:18.0526430Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0526571Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0557919Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.0570910Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0571042Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0600655Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.0614582Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0614811Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0634189Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.0647488Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0647628Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0671963Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.0684164Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0684285Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0704687Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.0723021Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0723260Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0742053Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.0763451Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0763587Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0787557Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.0802144Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0802280Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0827734Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.0842217Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0842568Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0861464Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.0875624Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0875754Z 
url.https://github.com/.insteadof 2025-12-04T11:11:18.0901165Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.0915381Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0915510Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0939760Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.0953610Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0953744Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0973618Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.0988913Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1007646Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1007790Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.1020723Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1020862Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1043282Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.1057103Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1057256Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1073833Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.1086456Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1086587Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1106750Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.1122258Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1122375Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1143583Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.1157657Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1158058Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1177210Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.1193711Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1193858Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1216472Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.1234551Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1234694Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1254199Z Entering 'third_party/fmt' 2025-12-04T11:11:18.1268010Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1268217Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1285201Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.1298936Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1299071Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1316848Z Entering 'third_party/gloo' 2025-12-04T11:11:18.1331918Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1332065Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1348964Z Entering 'third_party/googletest' 2025-12-04T11:11:18.1363654Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1363800Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1381260Z Entering 'third_party/ideep' 2025-12-04T11:11:18.1395444Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1395596Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1412373Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.1425909Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1426061Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1448355Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.1462608Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1462759Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1480848Z Entering 'third_party/kineto' 2025-12-04T11:11:18.1494492Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1494637Z 
url.https://github.com/.insteadof 2025-12-04T11:11:18.1512868Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:18.1529377Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1529764Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1549452Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:18.1563290Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1563540Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1582340Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:18.1597380Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1597528Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1615833Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:18.1629947Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1630095Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1648419Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:18.1663013Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1663147Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1682394Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:18.1694927Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1695061Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1727589Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:18.1746409Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1746561Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1763669Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:18.1784012Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1784224Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1805569Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:18.1822301Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1822450Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1840203Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:18.1855737Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1855895Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1873293Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:18.1886321Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1886475Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1904028Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.1918951Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1940698Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1940950Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.1957446Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1957600Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1978671Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:18.1993362Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1993513Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2013552Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:18.2027043Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2027196Z url.https://github.com/.insteadof 
2025-12-04T11:11:18.2047827Z Entering 'third_party/kleidiai' 2025-12-04T11:11:18.2063114Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2063266Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2081331Z Entering 'third_party/mimalloc' 2025-12-04T11:11:18.2095839Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2095998Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2113830Z Entering 'third_party/nlohmann' 2025-12-04T11:11:18.2128025Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2128222Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2146897Z Entering 'third_party/onnx' 2025-12-04T11:11:18.2161444Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2161598Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2186034Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:18.2199766Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2199923Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2220842Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:18.2234574Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2234728Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2261529Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:18.2274832Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2274981Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2294810Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:18.2310196Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2310345Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2331374Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:18.2345646Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2345812Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2364411Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:18.2385921Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2386073Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2403083Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:18.2420882Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2421027Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2439141Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:18.2454121Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2454271Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2476833Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:18.2493385Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2493532Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2516608Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.2534881Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2535119Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2555129Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.2570016Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2570163Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2590639Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:18.2609244Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2609403Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2635781Z Entering 'third_party/pocketfft' 2025-12-04T11:11:18.2650842Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2650994Z 
url.https://github.com/.insteadof 2025-12-04T11:11:18.2668002Z Entering 'third_party/protobuf' 2025-12-04T11:11:18.2683188Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2683335Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2703291Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:18.2717490Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2717644Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2736377Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:18.2749924Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2750080Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2771214Z Entering 'third_party/psimd' 2025-12-04T11:11:18.2790239Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2790389Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2809261Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:18.2824179Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2824329Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2843018Z Entering 'third_party/pybind11' 2025-12-04T11:11:18.2857633Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2857899Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2875980Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:18.2890490Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2890810Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2908349Z Entering 'third_party/sleef' 2025-12-04T11:11:18.2924061Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2924372Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2942024Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.2956659Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2956846Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2973993Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.2987837Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2987969Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3008063Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.3021759Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3021881Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3039500Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.3053842Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3053975Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3073161Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.3089793Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3090086Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3107388Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.3124034Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3124173Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3161595Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T11:11:18.3329477Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.3356046Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:18.3367573Z Entering 'third_party/FP16' 2025-12-04T11:11:18.3391774Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:18.3402952Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.3426956Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:18.3437917Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.3460416Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:18.3473899Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.3498429Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:18.3510143Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.3533969Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:18.3545117Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.3567191Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:18.3584034Z Entering 'third_party/aiter' 2025-12-04T11:11:18.3605887Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:18.3617625Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.3639285Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.3654705Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.3676950Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:18.3689377Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.3711584Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.3725782Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.3747944Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:18.3759249Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.3783109Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:18.3794236Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.3815525Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:18.3826507Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.3848492Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:18.3863458Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.3884889Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:18.3897945Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.3927850Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:18.3937936Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.3959586Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.3972608Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.3994720Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:18.4005008Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.4025894Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:18.4040202Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.4061125Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:18.4071108Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.4091916Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:18.4101638Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.4123138Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:18.4135300Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.4158843Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:18.4170559Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.4192197Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.4204940Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.4231989Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:18.4247340Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.4271030Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:18.4283615Z Entering 'third_party/fmt' 2025-12-04T11:11:18.4304955Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:18.4317487Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.4339766Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:18.4351356Z Entering 'third_party/gloo' 2025-12-04T11:11:18.4373003Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:18.4384016Z Entering 'third_party/googletest' 2025-12-04T11:11:18.4404989Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.4416795Z Entering 'third_party/ideep' 2025-12-04T11:11:18.4438472Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:18.4448826Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.4469707Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:18.4484408Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.4505949Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:18.4516915Z Entering 'third_party/kineto' 2025-12-04T11:11:18.4538614Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:18.4549611Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:18.4570271Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:18.4582824Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:18.4606860Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:18.4618317Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:18.4640439Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:18.4650586Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:18.4683981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:18.4694445Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:18.4719324Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:18.4729017Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:18.4750169Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:18.4762022Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:18.4784115Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:18.4794221Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:18.4818088Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.4827959Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:18.4850899Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:18.4861159Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:18.4883620Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:18.4892484Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:18.4918441Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:18.4932947Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.4955531Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:18.4966386Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.4992940Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:18.5008338Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:18.5031430Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:18.5041558Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:18.5063810Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.5075377Z Entering 'third_party/kleidiai' 2025-12-04T11:11:18.5096373Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:18.5108747Z Entering 'third_party/mimalloc' 2025-12-04T11:11:18.5131600Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:18.5144521Z Entering 'third_party/nlohmann' 2025-12-04T11:11:18.5167792Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:18.5180055Z Entering 'third_party/onnx' 2025-12-04T11:11:18.5204562Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:18.5220785Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:18.5246282Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:18.5260374Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:18.5281502Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:18.5296847Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:18.5322869Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:18.5333753Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:18.5358883Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.5368899Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:18.5397652Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:18.5408229Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:18.5433964Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:18.5444718Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:18.5481076Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:18.5494816Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:18.5517435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:18.5528537Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:18.5555915Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:18.5567658Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.5590297Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:18.5604198Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.5626197Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:18.5638052Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:18.5661019Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:18.5684455Z Entering 'third_party/pocketfft' 2025-12-04T11:11:18.5710997Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:18.5722966Z Entering 'third_party/protobuf' 2025-12-04T11:11:18.5749864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:18.5762818Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:18.5797474Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:18.5809189Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:18.5830525Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.5844518Z Entering 'third_party/psimd' 2025-12-04T11:11:18.5867348Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:18.5879222Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:18.5901830Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:18.5913134Z Entering 'third_party/pybind11' 2025-12-04T11:11:18.5936774Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:18.5948551Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:18.5972288Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:18.5983095Z Entering 'third_party/sleef' 2025-12-04T11:11:18.6005715Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:18.6016646Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.6040914Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:18.6051887Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.6075752Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.6094063Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.6121262Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:18.6132550Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.6167801Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:18.6179168Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.6200213Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:18.6212623Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.6234320Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:18.6512381Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T11:11:18.6692760Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.6719344Z Entering 'third_party/FP16' 2025-12-04T11:11:18.6748280Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.6770224Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.6796619Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.6820880Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.6844876Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.6873889Z Entering 'third_party/aiter' 2025-12-04T11:11:18.6909912Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.6944341Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.6971789Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.7000746Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.7031106Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.7061511Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.7085870Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.7116013Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.7138999Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.7159007Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.7189467Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.7210634Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.7238073Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.7265333Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.7300481Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.7326644Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.7353501Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.7380445Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.7413156Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.7442889Z Entering 'third_party/fmt' 2025-12-04T11:11:18.7465134Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.7488357Z Entering 'third_party/gloo' 2025-12-04T11:11:18.7515103Z Entering 'third_party/googletest' 2025-12-04T11:11:18.7537353Z Entering 'third_party/ideep' 2025-12-04T11:11:18.7560963Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.7582942Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.7608031Z Entering 'third_party/kineto' 2025-12-04T11:11:18.7631961Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:18.7653855Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:18.7677698Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:18.7705841Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:18.7735247Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:18.7758324Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:18.7786242Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:18.7807630Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:18.7839613Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:18.7868799Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:18.7890209Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:18.7914283Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.7937725Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.7970726Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:18.7997740Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:18.8033328Z Entering 'third_party/kleidiai' 2025-12-04T11:11:18.8059532Z Entering 'third_party/mimalloc' 2025-12-04T11:11:18.8082410Z Entering 'third_party/nlohmann' 2025-12-04T11:11:18.8106057Z Entering 'third_party/onnx' 2025-12-04T11:11:18.8135445Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:18.8159820Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:18.8185339Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:18.8208465Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:18.8237944Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:18.8269636Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:18.8296200Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:18.8329529Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:18.8352089Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:18.8378965Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.8401357Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.8434159Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:18.8467090Z Entering 'third_party/pocketfft' 2025-12-04T11:11:18.8493139Z Entering 'third_party/protobuf' 2025-12-04T11:11:18.8525729Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:18.8557504Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:18.8590230Z Entering 'third_party/psimd' 2025-12-04T11:11:18.8615061Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:18.8638463Z Entering 'third_party/pybind11' 2025-12-04T11:11:18.8669071Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:18.8693495Z Entering 'third_party/sleef' 2025-12-04T11:11:18.8722366Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.8745592Z 
Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.8771782Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.8795824Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.8828468Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.8853217Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.8893587Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T11:11:18.9079833Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.9100196Z Entering 'third_party/FP16' 2025-12-04T11:11:18.9130082Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.9153125Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.9183262Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.9207956Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.9228501Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.9257388Z Entering 'third_party/aiter' 2025-12-04T11:11:18.9283263Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.9314470Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.9339347Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.9367004Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.9398439Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.9424354Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.9451229Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.9479376Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.9504768Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.9531336Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.9564271Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.9591660Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.9618475Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.9641537Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.9662630Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.9693235Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.9715669Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.9740958Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.9773025Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.9796578Z Entering 'third_party/fmt' 2025-12-04T11:11:18.9818345Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.9840553Z Entering 'third_party/gloo' 2025-12-04T11:11:18.9863698Z Entering 'third_party/googletest' 2025-12-04T11:11:18.9885563Z Entering 'third_party/ideep' 2025-12-04T11:11:18.9908566Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.9938227Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.9969783Z Entering 'third_party/kineto' 2025-12-04T11:11:18.9993244Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:19.0018418Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:19.0044739Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:19.0072654Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:19.0098636Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:19.0123128Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 
2025-12-04T11:11:19.0146698Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:19.0167558Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:19.0188788Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:19.0208917Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:19.0228392Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:19.0254563Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:19.0280847Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:19.0307452Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:19.0326795Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:19.0356767Z Entering 'third_party/kleidiai' 2025-12-04T11:11:19.0378906Z Entering 'third_party/mimalloc' 2025-12-04T11:11:19.0409234Z Entering 'third_party/nlohmann' 2025-12-04T11:11:19.0438555Z Entering 'third_party/onnx' 2025-12-04T11:11:19.0475787Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:19.0499864Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:19.0520245Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:19.0540118Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:19.0567079Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:19.0594604Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:19.0617980Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:19.0646662Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:19.0671030Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:19.0698257Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:19.0727528Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:19.0758627Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:19.0799176Z Entering 'third_party/pocketfft' 2025-12-04T11:11:19.0824489Z Entering 'third_party/protobuf' 2025-12-04T11:11:19.0852695Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:19.0875438Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:19.0903160Z Entering 'third_party/psimd' 2025-12-04T11:11:19.0930948Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:19.0962089Z Entering 'third_party/pybind11' 2025-12-04T11:11:19.0986811Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:19.1006874Z Entering 'third_party/sleef' 2025-12-04T11:11:19.1029491Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:19.1051913Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:19.1073092Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:19.1095104Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:19.1122488Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:19.1147989Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:19.1182175Z ##[endgroup] 2025-12-04T11:11:19.1465746Z [command]/usr/bin/git log -1 
--format=%H 2025-12-04T11:11:19.1684611Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:19.1856409Z Prepare all required actions 2025-12-04T11:11:19.1856725Z Getting action download info 2025-12-04T11:11:19.4634623Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-12-04T11:11:20.2008121Z ##[group]Run ./.github/actions/setup-rocm 2025-12-04T11:11:20.2008320Z env: 2025-12-04T11:11:20.2008412Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2008516Z ##[endgroup] 2025-12-04T11:11:20.2019620Z ##[group]Run dpkg -l | grep -E " rocm" 2025-12-04T11:11:20.2019757Z dpkg -l | grep -E " rocm" 2025-12-04T11:11:20.2023206Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2023349Z env: 2025-12-04T11:11:20.2023438Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2023544Z ##[endgroup] 2025-12-04T11:11:20.2087030Z ii rocm-cmake 0.14.0.60401-83~22.04 amd64 rocm-cmake built using CMake 2025-12-04T11:11:20.2087263Z ii rocm-core 6.4.1.60401-83~22.04 amd64 ROCm Runtime software stack 2025-12-04T11:11:20.2087509Z ii rocm-dbgapi 0.77.2.60401-83~22.04 amd64 Library to provide AMD GPU debugger API 2025-12-04T11:11:20.2087761Z ii rocm-debug-agent 2.0.4.60401-83~22.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent) 2025-12-04T11:11:20.2088009Z ii rocm-dev 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T11:11:20.2088586Z ii rocm-device-libs 1.0.0.60401-83~22.04 amd64 Radeon Open Compute - device libraries 2025-12-04T11:11:20.2088793Z ii rocm-gdb 15.2.60401-83~22.04 amd64 ROCgdb 2025-12-04T11:11:20.2088989Z ii rocm-llvm 19.0.0.25184.60401-83~22.04 amd64 ROCm core compiler 2025-12-04T11:11:20.2089197Z ii rocm-opencl 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T11:11:20.2089539Z ii rocm-opencl-dev 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T11:11:20.2089887Z ii rocm-smi-lib 7.5.0.60401-83~22.04 amd64 AMD System Management libraries 2025-12-04T11:11:20.2090183Z ii rocm-utils 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T11:11:20.2090427Z ii rocminfo 1.0.0.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool 2025-12-04T11:11:20.2109411Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T11:11:20.2109754Z # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T11:11:20.2109953Z # shellcheck disable=SC2046 2025-12-04T11:11:20.2110135Z docker stop $(docker ps -q) || true 2025-12-04T11:11:20.2110300Z # Prune all stopped containers. 2025-12-04T11:11:20.2110620Z docker container prune -f 2025-12-04T11:11:20.2115736Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2115922Z env: 2025-12-04T11:11:20.2116036Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2116178Z ##[endgroup] 2025-12-04T11:11:20.2375698Z docker: 'docker stop' requires at least 1 argument 2025-12-04T11:11:20.2376054Z 2025-12-04T11:11:20.2376246Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 
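For reference, the two "git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' ..." invocations in the checkout step above install URL remappings so that any submodule remote recorded with an SSH-style prefix is fetched over HTTPS instead. A minimal sketch of the same remapping, assuming a local clone at ./pytorch (the path is hypothetical and not part of this job):

  # Rewrite SSH-style GitHub URLs to HTTPS for every submodule, recursively,
  # mirroring the git config url.<base>.insteadOf calls echoed in the log above.
  cd ./pytorch
  git submodule foreach --recursive \
    git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:'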
2025-12-04T11:11:20.2376523Z 2025-12-04T11:11:20.2376700Z See 'docker stop --help' for more information 2025-12-04T11:11:20.2478368Z Total reclaimed space: 0B 2025-12-04T11:11:20.2507005Z ##[group]Run cat /etc/os-release || true 2025-12-04T11:11:20.2507230Z cat /etc/os-release || true 2025-12-04T11:11:20.2507430Z cat /etc/apt/sources.list.d/rocm.list || true 2025-12-04T11:11:20.2507841Z cat /opt/rocm/.info/version || true 2025-12-04T11:11:20.2508019Z whoami 2025-12-04T11:11:20.2513465Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2513633Z env: 2025-12-04T11:11:20.2513729Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2513850Z ##[endgroup] 2025-12-04T11:11:20.2533855Z PRETTY_NAME="Ubuntu 22.04.5 LTS" 2025-12-04T11:11:20.2534172Z NAME="Ubuntu" 2025-12-04T11:11:20.2534370Z VERSION_ID="22.04" 2025-12-04T11:11:20.2534606Z VERSION="22.04.5 LTS (Jammy Jellyfish)" 2025-12-04T11:11:20.2534878Z VERSION_CODENAME=jammy 2025-12-04T11:11:20.2535084Z ID=ubuntu 2025-12-04T11:11:20.2535266Z ID_LIKE=debian 2025-12-04T11:11:20.2535503Z HOME_URL="https://www.ubuntu.com/" 2025-12-04T11:11:20.2535798Z SUPPORT_URL="https://help.ubuntu.com/" 2025-12-04T11:11:20.2536134Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" 2025-12-04T11:11:20.2536603Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" 2025-12-04T11:11:20.2537029Z UBUNTU_CODENAME=jammy 2025-12-04T11:11:20.2542256Z deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.4.1 jammy main 2025-12-04T11:11:20.2549659Z 6.4.1-83 2025-12-04T11:11:20.2555846Z runner 2025-12-04T11:11:20.2576487Z ##[group]Run dpkg -l | grep -E " amdgpu" 2025-12-04T11:11:20.2576688Z dpkg -l | grep -E " amdgpu" 2025-12-04T11:11:20.2581389Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2581539Z env: 2025-12-04T11:11:20.2581629Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2581734Z ##[endgroup] 2025-12-04T11:11:20.2629967Z ii amdgpu-core 1:6.4.60401-2164967.22.04 all Core meta package for unified amdgpu driver. 
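The container-cleanup step above runs "docker stop $(docker ps -q) || true"; with no containers running the substitution is empty, so docker stop prints the "requires at least 1 argument" usage error, the "|| true" swallows it, and "docker container prune -f" then reports 0B reclaimed. A sketch of an equivalent guard that simply skips the stop when nothing is running (illustrative only, not the workflow's script):

  # Stop running containers only if any exist, then prune stopped ones.
  running=$(docker ps -q)
  if [ -n "$running" ]; then
    # shellcheck disable=SC2086  # word-splitting of the ID list is intended
    docker stop $running
  fi
  docker container prune -f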
2025-12-04T11:11:20.2630220Z ii amdgpu-install 6.4.60401-2164967.22.04 all AMDGPU driver repository and installer 2025-12-04T11:11:20.2651408Z ##[group]Run rocm-smi 2025-12-04T11:11:20.2651586Z rocm-smi 2025-12-04T11:11:20.2656508Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2656712Z env: 2025-12-04T11:11:20.2656817Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2656925Z ##[endgroup] 2025-12-04T11:11:20.3387354Z 2025-12-04T11:11:20.3387461Z 2025-12-04T11:11:20.3387727Z ============================================ ROCm System Management Interface ============================================ 2025-12-04T11:11:20.3388007Z ====================================================== Concise Info ====================================================== 2025-12-04T11:11:20.3388316Z Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2025-12-04T11:11:20.3388969Z  (DID, GUID) (Junction) (Socket) (Mem, Compute, ID)  2025-12-04T11:11:20.3389205Z ========================================================================================================================== 2025-12-04T11:11:20.3389747Z 0 7 0x74a5, 26567 27.0°C 114.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3390259Z 1 9 0x74a5, 43978 28.0°C 118.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3390559Z 2 8 0x74a5, 20463 28.0°C 116.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3390858Z 3 6 0x74a5, 33762 27.0°C 117.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3391068Z ========================================================================================================================== 2025-12-04T11:11:20.3391258Z ================================================== End of ROCm SMI Log =================================================== 2025-12-04T11:11:20.3455049Z ##[group]Run rocminfo 2025-12-04T11:11:20.3455226Z rocminfo 2025-12-04T11:11:20.3460905Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.3461069Z env: 2025-12-04T11:11:20.3461196Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.3461308Z ##[endgroup] 2025-12-04T11:11:20.4449668Z ROCk module version 6.12.12 is loaded 2025-12-04T11:11:20.4449871Z ===================== 2025-12-04T11:11:20.4450084Z HSA System Attributes 2025-12-04T11:11:20.4450225Z ===================== 2025-12-04T11:11:20.4450365Z Runtime Version: 1.15 2025-12-04T11:11:20.4450541Z Runtime Ext Version: 1.7 2025-12-04T11:11:20.4450695Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T11:11:20.4450955Z Sig. 
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T11:11:20.4451311Z Machine Model: LARGE 2025-12-04T11:11:20.4451537Z System Endianness: LITTLE 2025-12-04T11:11:20.4451760Z Mwaitx: DISABLED 2025-12-04T11:11:20.4451920Z XNACK enabled: NO 2025-12-04T11:11:20.4452069Z DMAbuf Support: YES 2025-12-04T11:11:20.4452217Z VMM Support: YES 2025-12-04T11:11:20.4452312Z 2025-12-04T11:11:20.4452371Z ========== 2025-12-04T11:11:20.4475801Z HSA Agents 2025-12-04T11:11:20.4475974Z ========== 2025-12-04T11:11:20.4476076Z ******* 2025-12-04T11:11:20.4476183Z Agent 1 2025-12-04T11:11:20.4476281Z ******* 2025-12-04T11:11:20.4476415Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4476619Z Uuid: CPU-XX 2025-12-04T11:11:20.4476779Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4476995Z Vendor Name: CPU 2025-12-04T11:11:20.4477159Z Feature: None specified 2025-12-04T11:11:20.4477326Z Profile: FULL_PROFILE 2025-12-04T11:11:20.4477508Z Float Round Mode: NEAR 2025-12-04T11:11:20.4477681Z Max Queue Number: 0(0x0) 2025-12-04T11:11:20.4477846Z Queue Min Size: 0(0x0) 2025-12-04T11:11:20.4478006Z Queue Max Size: 0(0x0) 2025-12-04T11:11:20.4478221Z Queue Type: MULTI 2025-12-04T11:11:20.4478375Z Node: 0 2025-12-04T11:11:20.4478538Z Device Type: CPU 2025-12-04T11:11:20.4478694Z Cache Info: 2025-12-04T11:11:20.4478871Z L1: 49152(0xc000) KB 2025-12-04T11:11:20.4479014Z Chip ID: 0(0x0) 2025-12-04T11:11:20.4479168Z ASIC Revision: 0(0x0) 2025-12-04T11:11:20.4479349Z Cacheline Size: 64(0x40) 2025-12-04T11:11:20.4479505Z Max Clock Freq. (MHz): 3300 2025-12-04T11:11:20.4479833Z BDFID: 0 2025-12-04T11:11:20.4480011Z Internal Node ID: 0 2025-12-04T11:11:20.4480258Z Compute Unit: 128 2025-12-04T11:11:20.4480427Z SIMDs per CU: 0 2025-12-04T11:11:20.4480589Z Shader Engines: 0 2025-12-04T11:11:20.4480746Z Shader Arrs. per Eng.: 0 2025-12-04T11:11:20.4480916Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:11:20.4481076Z Memory Properties: 2025-12-04T11:11:20.4481197Z Features: None 2025-12-04T11:11:20.4481316Z Pool Info: 2025-12-04T11:11:20.4481514Z Pool 1 2025-12-04T11:11:20.4481660Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4481832Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4481995Z Allocatable: TRUE 2025-12-04T11:11:20.4482161Z Alloc Granule: 4KB 2025-12-04T11:11:20.4482327Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4482499Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4482672Z Accessible by all: TRUE 2025-12-04T11:11:20.4482813Z Pool 2 2025-12-04T11:11:20.4482978Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4483159Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4483351Z Allocatable: TRUE 2025-12-04T11:11:20.4483516Z Alloc Granule: 4KB 2025-12-04T11:11:20.4483681Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4483854Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4484021Z Accessible by all: TRUE 2025-12-04T11:11:20.4484158Z Pool 3 2025-12-04T11:11:20.4484309Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:11:20.4484499Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4484648Z Allocatable: TRUE 2025-12-04T11:11:20.4484811Z Alloc Granule: 4KB 2025-12-04T11:11:20.4484990Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4485159Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4485328Z Accessible by all: TRUE 2025-12-04T11:11:20.4485510Z Pool 4 2025-12-04T11:11:20.4485651Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4485813Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4485982Z Allocatable: TRUE 2025-12-04T11:11:20.4486171Z Alloc Granule: 4KB 2025-12-04T11:11:20.4486341Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4486504Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4486681Z Accessible by all: TRUE 2025-12-04T11:11:20.4486819Z ISA Info: 2025-12-04T11:11:20.4486931Z ******* 2025-12-04T11:11:20.4487065Z Agent 2 2025-12-04T11:11:20.4487197Z ******* 2025-12-04T11:11:20.4487322Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4487518Z Uuid: CPU-XX 2025-12-04T11:11:20.4487691Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4487861Z Vendor Name: CPU 2025-12-04T11:11:20.4488014Z Feature: None specified 2025-12-04T11:11:20.4488236Z Profile: FULL_PROFILE 2025-12-04T11:11:20.4488400Z Float Round Mode: NEAR 2025-12-04T11:11:20.4488557Z Max Queue Number: 0(0x0) 2025-12-04T11:11:20.4488724Z Queue Min Size: 0(0x0) 2025-12-04T11:11:20.4488881Z Queue Max Size: 0(0x0) 2025-12-04T11:11:20.4489077Z Queue Type: MULTI 2025-12-04T11:11:20.4489244Z Node: 1 2025-12-04T11:11:20.4489402Z Device Type: CPU 2025-12-04T11:11:20.4489552Z Cache Info: 2025-12-04T11:11:20.4489719Z L1: 49152(0xc000) KB 2025-12-04T11:11:20.4489889Z Chip ID: 0(0x0) 2025-12-04T11:11:20.4490043Z ASIC Revision: 0(0x0) 2025-12-04T11:11:20.4490242Z Cacheline Size: 64(0x40) 2025-12-04T11:11:20.4490417Z Max Clock Freq. (MHz): 3300 2025-12-04T11:11:20.4490572Z BDFID: 0 2025-12-04T11:11:20.4490721Z Internal Node ID: 1 2025-12-04T11:11:20.4490881Z Compute Unit: 128 2025-12-04T11:11:20.4491039Z SIMDs per CU: 0 2025-12-04T11:11:20.4491191Z Shader Engines: 0 2025-12-04T11:11:20.4491360Z Shader Arrs. per Eng.: 0 2025-12-04T11:11:20.4491529Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:11:20.4491673Z Memory Properties: 2025-12-04T11:11:20.4491789Z Features: None 2025-12-04T11:11:20.4491899Z Pool Info: 2025-12-04T11:11:20.4492010Z Pool 1 2025-12-04T11:11:20.4492149Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4492300Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4492457Z Allocatable: TRUE 2025-12-04T11:11:20.4492621Z Alloc Granule: 4KB 2025-12-04T11:11:20.4492785Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4492958Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4493124Z Accessible by all: TRUE 2025-12-04T11:11:20.4493261Z Pool 2 2025-12-04T11:11:20.4493399Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4493551Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4493705Z Allocatable: TRUE 2025-12-04T11:11:20.4493869Z Alloc Granule: 4KB 2025-12-04T11:11:20.4494031Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4494201Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4494367Z Accessible by all: TRUE 2025-12-04T11:11:20.4494507Z Pool 3 2025-12-04T11:11:20.4494643Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:11:20.4494829Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4494981Z Allocatable: TRUE 2025-12-04T11:11:20.4495133Z Alloc Granule: 4KB 2025-12-04T11:11:20.4495285Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4495441Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4495593Z Accessible by all: TRUE 2025-12-04T11:11:20.4495722Z Pool 4 2025-12-04T11:11:20.4495846Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4495986Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4496173Z Allocatable: TRUE 2025-12-04T11:11:20.4496327Z Alloc Granule: 4KB 2025-12-04T11:11:20.4496483Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4496639Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4496794Z Accessible by all: TRUE 2025-12-04T11:11:20.4496924Z ISA Info: 2025-12-04T11:11:20.4497019Z ******* 2025-12-04T11:11:20.4497111Z Agent 3 2025-12-04T11:11:20.4497204Z ******* 2025-12-04T11:11:20.4497309Z Name: gfx942 2025-12-04T11:11:20.4497443Z Uuid: GPU-e92b40ee81585045 2025-12-04T11:11:20.4497590Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4497739Z Vendor Name: AMD 2025-12-04T11:11:20.4497883Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4498027Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4498213Z Float Round Mode: NEAR 2025-12-04T11:11:20.4498364Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4498509Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4498650Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4498797Z Queue Type: MULTI 2025-12-04T11:11:20.4498934Z Node: 2 2025-12-04T11:11:20.4499068Z Device Type: GPU 2025-12-04T11:11:20.4499195Z Cache Info: 2025-12-04T11:11:20.4499302Z L1: 32(0x20) KB 2025-12-04T11:11:20.4499428Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4499560Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4499689Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4499831Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4499983Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4500127Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4500269Z BDFID: 62720 2025-12-04T11:11:20.4500414Z Internal Node ID: 2 2025-12-04T11:11:20.4500558Z Compute Unit: 304 2025-12-04T11:11:20.4500705Z SIMDs per CU: 4 2025-12-04T11:11:20.4500853Z Shader Engines: 32 2025-12-04T11:11:20.4501008Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4501171Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4501378Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4501519Z Memory Properties: 2025-12-04T11:11:20.4501634Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4501768Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4501919Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4502065Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4502202Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4502319Z x 1024(0x400) 2025-12-04T11:11:20.4502439Z y 1024(0x400) 2025-12-04T11:11:20.4502559Z z 1024(0x400) 2025-12-04T11:11:20.4502694Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4502875Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4503037Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4503169Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4503277Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4503401Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4503525Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4503664Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4509248Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4509413Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4509566Z IOMMU Support:: None 2025-12-04T11:11:20.4509699Z Pool Info: 2025-12-04T11:11:20.4509799Z Pool 1 2025-12-04T11:11:20.4509932Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4510081Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4510230Z Allocatable: TRUE 2025-12-04T11:11:20.4510384Z Alloc Granule: 4KB 2025-12-04T11:11:20.4510540Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4510699Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4510856Z Accessible by all: FALSE 2025-12-04T11:11:20.4510986Z Pool 2 2025-12-04T11:11:20.4511113Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4511258Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4511398Z Allocatable: TRUE 2025-12-04T11:11:20.4511550Z Alloc Granule: 4KB 2025-12-04T11:11:20.4511707Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4511863Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4512017Z Accessible by all: FALSE 2025-12-04T11:11:20.4512147Z Pool 3 2025-12-04T11:11:20.4512268Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4512411Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4512551Z Allocatable: TRUE 2025-12-04T11:11:20.4512703Z Alloc Granule: 4KB 2025-12-04T11:11:20.4512860Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4513016Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4513171Z Accessible by all: FALSE 2025-12-04T11:11:20.4513300Z Pool 4 2025-12-04T11:11:20.4513505Z Segment: GROUP 2025-12-04T11:11:20.4513638Z Size: 64(0x40) KB 2025-12-04T11:11:20.4513775Z Allocatable: FALSE 2025-12-04T11:11:20.4513922Z Alloc Granule: 0KB 2025-12-04T11:11:20.4514078Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4514233Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4514387Z Accessible by all: FALSE 2025-12-04T11:11:20.4514518Z ISA Info: 2025-12-04T11:11:20.4514618Z ISA 1 2025-12-04T11:11:20.4514787Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4514948Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4515109Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4515265Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4515420Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4515571Z Fast f16: TRUE 2025-12-04T11:11:20.4515719Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4515857Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4515985Z x 1024(0x400) 2025-12-04T11:11:20.4516114Z y 1024(0x400) 2025-12-04T11:11:20.4516241Z z 1024(0x400) 
2025-12-04T11:11:20.4516386Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4516517Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4516635Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4516767Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4516888Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4517034Z FBarrier Max Size: 32 2025-12-04T11:11:20.4517165Z ISA 2 2025-12-04T11:11:20.4517304Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4517480Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4517634Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4517785Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4517941Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4518087Z Fast f16: TRUE 2025-12-04T11:11:20.4518270Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4518409Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4518533Z x 1024(0x400) 2025-12-04T11:11:20.4518659Z y 1024(0x400) 2025-12-04T11:11:20.4518780Z z 1024(0x400) 2025-12-04T11:11:20.4518918Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4519056Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4519172Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4519303Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4519431Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4519574Z FBarrier Max Size: 32 2025-12-04T11:11:20.4519709Z ******* 2025-12-04T11:11:20.4519848Z Agent 4 2025-12-04T11:11:20.4519947Z ******* 2025-12-04T11:11:20.4520061Z Name: gfx942 2025-12-04T11:11:20.4520202Z Uuid: GPU-0f23c118dd1bca7f 2025-12-04T11:11:20.4520358Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4520517Z Vendor Name: AMD 2025-12-04T11:11:20.4520664Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4520815Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4520963Z Float Round Mode: NEAR 2025-12-04T11:11:20.4521118Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4521310Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4521457Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4521610Z Queue Type: MULTI 2025-12-04T11:11:20.4521757Z Node: 3 2025-12-04T11:11:20.4521897Z Device Type: GPU 2025-12-04T11:11:20.4522032Z Cache Info: 2025-12-04T11:11:20.4522145Z L1: 32(0x20) KB 2025-12-04T11:11:20.4522283Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4522416Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4522548Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4522695Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4522851Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4523002Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4523154Z BDFID: 34048 2025-12-04T11:11:20.4523295Z Internal Node ID: 3 2025-12-04T11:11:20.4523447Z Compute Unit: 304 2025-12-04T11:11:20.4523595Z SIMDs per CU: 4 2025-12-04T11:11:20.4523742Z Shader Engines: 32 2025-12-04T11:11:20.4523898Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4524055Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4524212Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4524353Z Memory Properties: 2025-12-04T11:11:20.4524464Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4524610Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4524767Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4524919Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4525061Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4525184Z x 1024(0x400) 2025-12-04T11:11:20.4525308Z y 1024(0x400) 2025-12-04T11:11:20.4525433Z z 1024(0x400) 2025-12-04T11:11:20.4525571Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4525722Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4525876Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4526008Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4526124Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4526256Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4526382Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4527137Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4527303Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4527460Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4527616Z IOMMU Support:: None 2025-12-04T11:11:20.4527750Z Pool Info: 2025-12-04T11:11:20.4527859Z Pool 1 2025-12-04T11:11:20.4527989Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4528137Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4528328Z Allocatable: TRUE 2025-12-04T11:11:20.4528484Z Alloc Granule: 4KB 2025-12-04T11:11:20.4528686Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4528854Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4529016Z Accessible by all: FALSE 2025-12-04T11:11:20.4529152Z Pool 2 2025-12-04T11:11:20.4529281Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4529426Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4529573Z Allocatable: TRUE 2025-12-04T11:11:20.4529727Z Alloc Granule: 4KB 2025-12-04T11:11:20.4529884Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4530047Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4530204Z Accessible by all: FALSE 2025-12-04T11:11:20.4530340Z Pool 3 2025-12-04T11:11:20.4530473Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4530618Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4530764Z Allocatable: TRUE 2025-12-04T11:11:20.4530919Z Alloc Granule: 4KB 2025-12-04T11:11:20.4531076Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4531239Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4531395Z Accessible by all: FALSE 2025-12-04T11:11:20.4531526Z Pool 4 2025-12-04T11:11:20.4531646Z Segment: GROUP 2025-12-04T11:11:20.4531783Z Size: 64(0x40) KB 2025-12-04T11:11:20.4531926Z Allocatable: FALSE 2025-12-04T11:11:20.4532074Z Alloc Granule: 0KB 2025-12-04T11:11:20.4532235Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4532390Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4532546Z Accessible by all: FALSE 2025-12-04T11:11:20.4532683Z ISA Info: 2025-12-04T11:11:20.4532779Z ISA 1 2025-12-04T11:11:20.4532908Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4533068Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4533232Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4533396Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4533559Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4533710Z Fast f16: TRUE 2025-12-04T11:11:20.4533898Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4534043Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4534173Z x 1024(0x400) 2025-12-04T11:11:20.4534301Z y 1024(0x400) 2025-12-04T11:11:20.4534433Z z 1024(0x400) 
2025-12-04T11:11:20.4534574Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4534707Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4534826Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4534954Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4535109Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4535254Z FBarrier Max Size: 32 2025-12-04T11:11:20.4535389Z ISA 2 2025-12-04T11:11:20.4535531Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4535704Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4535859Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4536019Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4536182Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4536330Z Fast f16: TRUE 2025-12-04T11:11:20.4536480Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4536618Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4536744Z x 1024(0x400) 2025-12-04T11:11:20.4536874Z y 1024(0x400) 2025-12-04T11:11:20.4536999Z z 1024(0x400) 2025-12-04T11:11:20.4537141Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4537279Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4537395Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4537527Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4537652Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4537796Z FBarrier Max Size: 32 2025-12-04T11:11:20.4537928Z ******* 2025-12-04T11:11:20.4538023Z Agent 5 2025-12-04T11:11:20.4538122Z ******* 2025-12-04T11:11:20.4538263Z Name: gfx942 2025-12-04T11:11:20.4538407Z Uuid: GPU-1385052698a87313 2025-12-04T11:11:20.4538561Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4538717Z Vendor Name: AMD 2025-12-04T11:11:20.4538867Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4539017Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4539166Z Float Round Mode: NEAR 2025-12-04T11:11:20.4539321Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4539472Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4539619Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4539769Z Queue Type: MULTI 2025-12-04T11:11:20.4539907Z Node: 4 2025-12-04T11:11:20.4540053Z Device Type: GPU 2025-12-04T11:11:20.4540187Z Cache Info: 2025-12-04T11:11:20.4540348Z L1: 32(0x20) KB 2025-12-04T11:11:20.4540481Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4540614Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4540746Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4540894Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4541047Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4541195Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4541340Z BDFID: 58624 2025-12-04T11:11:20.4541484Z Internal Node ID: 4 2025-12-04T11:11:20.4541679Z Compute Unit: 304 2025-12-04T11:11:20.4541828Z SIMDs per CU: 4 2025-12-04T11:11:20.4541983Z Shader Engines: 32 2025-12-04T11:11:20.4542142Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4542301Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4542460Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4542602Z Memory Properties: 2025-12-04T11:11:20.4542718Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4542867Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4543026Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4543180Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4543327Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4543459Z x 1024(0x400) 2025-12-04T11:11:20.4543589Z y 1024(0x400) 2025-12-04T11:11:20.4543722Z z 1024(0x400) 2025-12-04T11:11:20.4543859Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4544023Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4544179Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4544315Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4544435Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4544567Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4544694Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4544846Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4545012Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4545181Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4545342Z IOMMU Support:: None 2025-12-04T11:11:20.4545479Z Pool Info: 2025-12-04T11:11:20.4545590Z Pool 1 2025-12-04T11:11:20.4545725Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4545875Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4546030Z Allocatable: TRUE 2025-12-04T11:11:20.4546194Z Alloc Granule: 4KB 2025-12-04T11:11:20.4546360Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4546527Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4546686Z Accessible by all: FALSE 2025-12-04T11:11:20.4546827Z Pool 2 2025-12-04T11:11:20.4546967Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4547115Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4547298Z Allocatable: TRUE 2025-12-04T11:11:20.4547459Z Alloc Granule: 4KB 2025-12-04T11:11:20.4547621Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4547789Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4547946Z Accessible by all: FALSE 2025-12-04T11:11:20.4548088Z Pool 3 2025-12-04T11:11:20.4548271Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4548416Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4548568Z Allocatable: TRUE 2025-12-04T11:11:20.4548775Z Alloc Granule: 4KB 2025-12-04T11:11:20.4548937Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4549111Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4549277Z Accessible by all: FALSE 2025-12-04T11:11:20.4549415Z Pool 4 2025-12-04T11:11:20.4549546Z Segment: GROUP 2025-12-04T11:11:20.4549686Z Size: 64(0x40) KB 2025-12-04T11:11:20.4549838Z Allocatable: FALSE 2025-12-04T11:11:20.4549999Z Alloc Granule: 0KB 2025-12-04T11:11:20.4550158Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4550324Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4550490Z Accessible by all: FALSE 2025-12-04T11:11:20.4550625Z ISA Info: 2025-12-04T11:11:20.4550734Z ISA 1 2025-12-04T11:11:20.4550860Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4551025Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4551186Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4551342Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4551507Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4551659Z Fast f16: TRUE 2025-12-04T11:11:20.4551807Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4551951Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4552078Z x 1024(0x400) 2025-12-04T11:11:20.4552211Z y 1024(0x400) 2025-12-04T11:11:20.4552342Z z 1024(0x400) 
2025-12-04T11:11:20.4552480Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4552622Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4552743Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4552872Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4553002Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4553147Z FBarrier Max Size: 32 2025-12-04T11:11:20.4553278Z ISA 2 2025-12-04T11:11:20.4553415Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4553584Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4553741Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4553942Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4554101Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4554257Z Fast f16: TRUE 2025-12-04T11:11:20.4554408Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4554698Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4554961Z x 1024(0x400) 2025-12-04T11:11:20.4555217Z y 1024(0x400) 2025-12-04T11:11:20.4555350Z z 1024(0x400) 2025-12-04T11:11:20.4555487Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4555630Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4555792Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4555921Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4556057Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4556204Z FBarrier Max Size: 32 2025-12-04T11:11:20.4556339Z ******* 2025-12-04T11:11:20.4556439Z Agent 6 2025-12-04T11:11:20.4556534Z ******* 2025-12-04T11:11:20.4556649Z Name: gfx942 2025-12-04T11:11:20.4556799Z Uuid: GPU-7b47bcc6019ee30a 2025-12-04T11:11:20.4556953Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4557116Z Vendor Name: AMD 2025-12-04T11:11:20.4557268Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4557426Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4557581Z Float Round Mode: NEAR 2025-12-04T11:11:20.4557736Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4557889Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4558040Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4558221Z Queue Type: MULTI 2025-12-04T11:11:20.4558367Z Node: 5 2025-12-04T11:11:20.4558506Z Device Type: GPU 2025-12-04T11:11:20.4558639Z Cache Info: 2025-12-04T11:11:20.4558754Z L1: 32(0x20) KB 2025-12-04T11:11:20.4558882Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4559015Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4559151Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4559296Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4559449Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4559598Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4559745Z BDFID: 38144 2025-12-04T11:11:20.4559892Z Internal Node ID: 5 2025-12-04T11:11:20.4560040Z Compute Unit: 304 2025-12-04T11:11:20.4560191Z SIMDs per CU: 4 2025-12-04T11:11:20.4560343Z Shader Engines: 32 2025-12-04T11:11:20.4560496Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4560658Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4560819Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4561010Z Memory Properties: 2025-12-04T11:11:20.4561127Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4561266Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4561426Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4561583Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4561722Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4561851Z x 1024(0x400) 2025-12-04T11:11:20.4561980Z y 1024(0x400) 2025-12-04T11:11:20.4562102Z z 1024(0x400) 2025-12-04T11:11:20.4562241Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4562435Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4562591Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4562731Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4562843Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4562974Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4563105Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4563247Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4563412Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4563570Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4563726Z IOMMU Support:: None 2025-12-04T11:11:20.4563866Z Pool Info: 2025-12-04T11:11:20.4563967Z Pool 1 2025-12-04T11:11:20.4564098Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4564254Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4564403Z Allocatable: TRUE 2025-12-04T11:11:20.4564560Z Alloc Granule: 4KB 2025-12-04T11:11:20.4564724Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4564884Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4565045Z Accessible by all: FALSE 2025-12-04T11:11:20.4565178Z Pool 2 2025-12-04T11:11:20.4565309Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4565458Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4565601Z Allocatable: TRUE 2025-12-04T11:11:20.4565758Z Alloc Granule: 4KB 2025-12-04T11:11:20.4565920Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4566080Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4566240Z Accessible by all: FALSE 2025-12-04T11:11:20.4566373Z Pool 3 2025-12-04T11:11:20.4566500Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4566647Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4566789Z Allocatable: TRUE 2025-12-04T11:11:20.4566943Z Alloc Granule: 4KB 2025-12-04T11:11:20.4567104Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4567263Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4567420Z Accessible by all: FALSE 2025-12-04T11:11:20.4567558Z Pool 4 2025-12-04T11:11:20.4567678Z Segment: GROUP 2025-12-04T11:11:20.4567858Z Size: 64(0x40) KB 2025-12-04T11:11:20.4568000Z Allocatable: FALSE 2025-12-04T11:11:20.4568196Z Alloc Granule: 0KB 2025-12-04T11:11:20.4568358Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4568516Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4568674Z Accessible by all: FALSE 2025-12-04T11:11:20.4568811Z ISA Info: 2025-12-04T11:11:20.4568911Z ISA 1 2025-12-04T11:11:20.4569043Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4569247Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4569410Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4569577Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4569739Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4569892Z Fast f16: TRUE 2025-12-04T11:11:20.4570045Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4570185Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4570314Z x 1024(0x400) 2025-12-04T11:11:20.4570441Z y 1024(0x400) 2025-12-04T11:11:20.4570570Z z 1024(0x400) 
2025-12-04T11:11:20.4570712Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4570847Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4570974Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4571111Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4571240Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4571389Z FBarrier Max Size: 32 2025-12-04T11:11:20.4571524Z ISA 2 2025-12-04T11:11:20.4571663Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4571838Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4572003Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4572166Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4572348Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4572502Z Fast f16: TRUE 2025-12-04T11:11:20.4572654Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4572799Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4572922Z x 1024(0x400) 2025-12-04T11:11:20.4573052Z y 1024(0x400) 2025-12-04T11:11:20.4573177Z z 1024(0x400) 2025-12-04T11:11:20.4573314Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4573610Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4573728Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4573861Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4573990Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4574130Z FBarrier Max Size: 32 2025-12-04T11:11:20.4574269Z *** Done *** 2025-12-04T11:11:20.4584137Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:20.4584506Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:20.4584783Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-12-04T11:11:20.4585046Z if [[ $ngpu -eq 0 ]]; then 2025-12-04T11:11:20.4585193Z  echo "Error: Failed to detect any GPUs on the runner" 2025-12-04T11:11:20.4585332Z  echo "$msg" 2025-12-04T11:11:20.4585432Z  exit 1 2025-12-04T11:11:20.4585524Z fi 2025-12-04T11:11:20.4588276Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.4588421Z env: 2025-12-04T11:11:20.4588506Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.4588610Z ##[endgroup] 2025-12-04T11:11:20.5718540Z ##[group]Run pytorch/pytorch/.github/actions/diskspace-cleanup@main 2025-12-04T11:11:20.5718728Z with: 2025-12-04T11:11:20.5718866Z diskspace-cutoff: 70 2025-12-04T11:11:20.5718979Z env: 2025-12-04T11:11:20.5719080Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.5719187Z ##[endgroup] 2025-12-04T11:11:20.5746807Z ##[group]Run set -ex 2025-12-04T11:11:20.5747033Z set -ex 2025-12-04T11:11:20.5747150Z diskspace_cutoff=70 2025-12-04T11:11:20.5772791Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-12-04T11:11:20.5772967Z if [ ! -d "$docker_root_dir" ]; then 2025-12-04T11:11:20.5773180Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-12-04T11:11:20.5773376Z  exit 0 2025-12-04T11:11:20.5773471Z fi 2025-12-04T11:11:20.5773645Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T11:11:20.5773982Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2025-12-04T11:11:20.5774265Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-12-04T11:11:20.5774425Z  docker system prune -af 2025-12-04T11:11:20.5774620Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T11:11:20.5774839Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-12-04T11:11:20.5775009Z  diskspace_cutoff_int=$((diskspace_cutoff + 0)) 2025-12-04T11:11:20.5775166Z  difference=$((100 - diskspace_cutoff_int)) 2025-12-04T11:11:20.5775379Z  echo "Error: Available diskspace is less than $difference percent. Not enough diskspace." 2025-12-04T11:11:20.5775574Z  echo "$msg" 2025-12-04T11:11:20.5775680Z  exit 1 2025-12-04T11:11:20.5775783Z  else 2025-12-04T11:11:20.5775905Z  difference=$((diskspace - diskspace_new)) 2025-12-04T11:11:20.5776060Z  echo "Diskspace saved: $difference percent" 2025-12-04T11:11:20.5776203Z  fi 2025-12-04T11:11:20.5776289Z fi 2025-12-04T11:11:20.5780987Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.5781144Z env: 2025-12-04T11:11:20.5781238Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.5781347Z ##[endgroup] 2025-12-04T11:11:20.5799546Z + diskspace_cutoff=70 2025-12-04T11:11:20.5802160Z ++ docker info -f '{{.DockerRootDir}}' 2025-12-04T11:11:20.6186386Z + docker_root_dir=/home/runner/docker-data 2025-12-04T11:11:20.6186643Z + '[' '!' -d /home/runner/docker-data ']' 2025-12-04T11:11:20.6194765Z ++ df -H --output=pcent /home/runner/docker-data 2025-12-04T11:11:20.6195227Z ++ sed -n 2p 2025-12-04T11:11:20.6195467Z ++ sed s/%// 2025-12-04T11:11:20.6195696Z ++ sed 's/ //' 2025-12-04T11:11:20.6210979Z + diskspace=' 4' 2025-12-04T11:11:20.6211579Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified' 2025-12-04T11:11:20.6212663Z + [[ 4 -ge 70 ]] 2025-12-04T11:11:20.6239027Z ##[group]Run RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T11:11:20.6239266Z RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T11:11:20.6239443Z rm -rf "${RUNNER_ARTIFACT_DIR}" 2025-12-04T11:11:20.6239593Z mkdir -p "${RUNNER_ARTIFACT_DIR}" 2025-12-04T11:11:20.6239781Z echo "RUNNER_ARTIFACT_DIR=${RUNNER_ARTIFACT_DIR}" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6239958Z  2025-12-04T11:11:20.6240098Z RUNNER_TEST_RESULTS_DIR="${RUNNER_TEMP}/test-results" 2025-12-04T11:11:20.6240275Z rm -rf "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T11:11:20.6240422Z mkdir -p "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T11:11:20.6240623Z echo "RUNNER_TEST_RESULTS_DIR=${RUNNER_TEST_RESULTS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6240815Z  2025-12-04T11:11:20.6241119Z RUNNER_DOCS_DIR="${RUNNER_TEMP}/docs" 2025-12-04T11:11:20.6241265Z rm -rf "${RUNNER_DOCS_DIR}" 2025-12-04T11:11:20.6241412Z mkdir -p "${RUNNER_DOCS_DIR}" 2025-12-04T11:11:20.6241573Z echo "RUNNER_DOCS_DIR=${RUNNER_DOCS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6245923Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.6246064Z env: 2025-12-04T11:11:20.6246153Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6246254Z ##[endgroup] 2025-12-04T11:11:20.6330757Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:20.6331071Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:20.6331329Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:20.6335727Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.6335888Z env: 2025-12-04T11:11:20.6335996Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6336144Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:20.6336341Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:20.6336520Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:20.6336662Z ##[endgroup] 2025-12-04T11:11:20.6389407Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T11:11:20.6389737Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T11:11:20.6389953Z # Add render group for container creation. 2025-12-04T11:11:20.6390131Z render_gid=`cat /etc/group | grep render | cut -d: -f3` 2025-12-04T11:11:20.6390340Z # Ensure GPU isolation if pod is part of kubernetes setup with DEVICE_FLAG. 2025-12-04T11:11:20.6390552Z if [ -f "/etc/podinfo/gha-render-devices" ]; then 2025-12-04T11:11:20.6390740Z  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices) 2025-12-04T11:11:20.6390889Z else 2025-12-04T11:11:20.6390996Z  DEVICE_FLAG="--device /dev/dri" 2025-12-04T11:11:20.6391130Z fi 2025-12-04T11:11:20.6391317Z # The --group-add daemon and --group-add bin are needed in the Ubuntu 24.04 and Almalinux OSs respectively. 2025-12-04T11:11:20.6391595Z # This is due to the device files (/dev/kfd & /dev/dri) being owned by video group on bare metal. 2025-12-04T11:11:20.6391849Z # This video group ID maps to subgid 1 inside the docker image due to the /etc/subgid entries. 2025-12-04T11:11:20.6392121Z # The group name corresponding to group ID 1 can change depending on the OS, so both are necessary. 
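The disk-space guard traced above reads the usage percentage of the Docker data root (4% here, well under the 70% cutoff, so no prune ran). A compact sketch of the same idea; the tail/tr parsing is a simplification of the sed pipeline in the actual step, and the cutoff value is taken from this log:

  # Prune Docker data if the filesystem holding it exceeds a usage cutoff.
  cutoff=70
  root=$(docker info -f '{{.DockerRootDir}}')
  used=$(df --output=pcent "$root" | tail -1 | tr -dc '0-9')
  if [ "$used" -ge "$cutoff" ]; then
    docker system prune -af
  fi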
2025-12-04T11:11:20.6392568Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd $DEVICE_FLAG --group-add video --group-add $render_gid --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6397205Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.6397493Z env: 2025-12-04T11:11:20.6397585Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6397713Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:20.6397887Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:20.6398047Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:20.6398221Z ##[endgroup] 2025-12-04T11:11:20.6476243Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-12-04T11:11:20.6476505Z with: 2025-12-04T11:11:20.6476700Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2025-12-04T11:11:20.6476926Z aws-region: us-east-1 2025-12-04T11:11:20.6477067Z role-duration-seconds: 18000 2025-12-04T11:11:20.6477227Z audience: sts.amazonaws.com 2025-12-04T11:11:20.6477369Z env: 2025-12-04T11:11:20.6477484Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6477761Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:20.6477980Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:20.6478274Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:20.6478939Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:20.6479576Z ##[endgroup] 2025-12-04T11:11:20.9482319Z Assuming role with OIDC 2025-12-04T11:11:21.3016143Z Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2025-12-04T11:11:21.3966083Z ##[group]Run aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 2025-12-04T11:11:21.3966306Z with: 2025-12-04T11:11:21.3966414Z mask-password: true 2025-12-04T11:11:21.3966549Z registry-type: private 2025-12-04T11:11:21.3966679Z skip-logout: false 2025-12-04T11:11:21.3966789Z env: 2025-12-04T11:11:21.3966902Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:21.3967061Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:21.3967262Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:21.3967453Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:21.3968034Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:21.3968818Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:21.3968950Z AWS_REGION: us-east-1 2025-12-04T11:11:21.3969344Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:21.3969517Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:21.3971663Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:21.3971770Z ##[endgroup] 2025-12-04T11:11:21.8171748Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.4450604Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 
2025-12-04T11:11:22.4450891Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:22.4451138Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:22.4451364Z env | grep '^RUNNER' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:22.4456329Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.4456521Z env: 2025-12-04T11:11:22.4456642Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.4456824Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.4457047Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.4457258Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.4457921Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.4458731Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.4458881Z AWS_REGION: us-east-1 2025-12-04T11:11:22.4459126Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.4459317Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.4461670Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.4461784Z ##[endgroup] 2025-12-04T11:11:22.4548507Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:22.4548706Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:22.4548960Z if [[ $ngpu -lt 2 ]]; then #We are temporarily reducing this down to 2 from 4 so that we can run tests on nodes with less gpus. 2025-12-04T11:11:22.4549253Z  echo "Error: only $ngpu GPU(s) detected, at least 2 GPUs are needed for distributed jobs" 2025-12-04T11:11:22.4549445Z  exit 1 2025-12-04T11:11:22.4549545Z fi 2025-12-04T11:11:22.4552373Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.4552521Z env: 2025-12-04T11:11:22.4552622Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.4552769Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.4552953Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.4553124Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.4553655Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.4554159Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.4554285Z AWS_REGION: us-east-1 2025-12-04T11:11:22.4554451Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.4554614Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.4556638Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.4556752Z ##[endgroup] 2025-12-04T11:11:22.5694269Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-12-04T11:11:22.5694448Z with: 2025-12-04T11:11:22.5694726Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5695031Z use-custom-docker-registry: true 2025-12-04T11:11:22.5695158Z docker-build-dir: .ci/docker 2025-12-04T11:11:22.5695279Z docker-build-script: ./build.sh 2025-12-04T11:11:22.5695398Z working-directory: . 
2025-12-04T11:11:22.5695537Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5695691Z force-push: false 2025-12-04T11:11:22.5695785Z env: 2025-12-04T11:11:22.5695876Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.5696011Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.5696188Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.5696360Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.5696866Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.5697356Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.5697469Z AWS_REGION: us-east-1 2025-12-04T11:11:22.5697612Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.5697762Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.5699829Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.5699951Z ##[endgroup] 2025-12-04T11:11:22.5708220Z ##[group]Run set -ex 2025-12-04T11:11:22.5708345Z set -ex 2025-12-04T11:11:22.5708438Z  2025-12-04T11:11:22.5708688Z # If the docker build directory or the build script doesn't exist, the action will 2025-12-04T11:11:22.5708937Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-12-04T11:11:22.5709148Z # job could then download the pre-built image as usual 2025-12-04T11:11:22.5709401Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-12-04T11:11:22.5709636Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5709764Z else 2025-12-04T11:11:22.5709871Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5710044Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5710193Z  2025-12-04T11:11:22.5710399Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 
2025-12-04T11:11:22.5710629Z  exit 0 2025-12-04T11:11:22.5710720Z fi 2025-12-04T11:11:22.5710807Z  2025-12-04T11:11:22.5710942Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-12-04T11:11:22.5711173Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-12-04T11:11:22.5711372Z  # use it as it is, but first let's extract the tag 2025-12-04T11:11:22.5711558Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-12-04T11:11:22.5711751Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5711940Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5712094Z else 2025-12-04T11:11:22.5712205Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-12-04T11:11:22.5712358Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-12-04T11:11:22.5712515Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-12-04T11:11:22.5712645Z  fi 2025-12-04T11:11:22.5712879Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-12-04T11:11:22.5713106Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5713346Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5713603Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5713763Z fi 2025-12-04T11:11:22.5716421Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.5716563Z env: 2025-12-04T11:11:22.5716660Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.5716805Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.5716984Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.5717153Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.5717658Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.5718182Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.5718300Z AWS_REGION: us-east-1 2025-12-04T11:11:22.5718439Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.5718594Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.5720596Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.5720700Z REPO_NAME: pytorch 2025-12-04T11:11:22.5720979Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5721278Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T11:11:22.5721443Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-12-04T11:11:22.5721597Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5721760Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-12-04T11:11:22.5721881Z CUSTOM_TAG_PREFIX: 2025-12-04T11:11:22.5721985Z ##[endgroup] 2025-12-04T11:11:22.5737746Z + [[ -d .ci/docker ]] 2025-12-04T11:11:22.5737922Z + [[ -f .ci/docker/./build.sh ]] 2025-12-04T11:11:22.5738063Z + [[ true == \t\r\u\e ]] 2025-12-04T11:11:22.5738684Z + echo skip=false 2025-12-04T11:11:22.5739162Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a == 
*\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-12-04T11:11:22.5746858Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5749135Z ++ awk -F '[:,]' '{print $2}' 2025-12-04T11:11:22.5762813Z + DOCKER_TAG=pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5763412Z + echo docker-tag=pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5764149Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5792688Z ##[group]Run set +e 2025-12-04T11:11:22.5792878Z set +e 2025-12-04T11:11:22.5792999Z set -x 2025-12-04T11:11:22.5793120Z  2025-12-04T11:11:22.5793229Z login() { 2025-12-04T11:11:22.5793461Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T11:11:22.5793698Z } 2025-12-04T11:11:22.5793815Z  2025-12-04T11:11:22.5793923Z retry () { 2025-12-04T11:11:22.5794060Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T11:11:22.5794209Z } 2025-12-04T11:11:22.5794339Z  2025-12-04T11:11:22.5794457Z retry login "${DOCKER_REGISTRY}" 2025-12-04T11:11:22.5794601Z  2025-12-04T11:11:22.5794876Z START_TIME=$(date +%s) 2025-12-04T11:11:22.5795025Z # Wait up to 120 minutes 2025-12-04T11:11:22.5795200Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-12-04T11:11:22.5795432Z  # Check if image already exists, if it does then skip building it 2025-12-04T11:11:22.5795650Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-12-04T11:11:22.5795821Z  exit 0 2025-12-04T11:11:22.5795940Z  fi 2025-12-04T11:11:22.5796050Z  2025-12-04T11:11:22.5796227Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-12-04T11:11:22.5796518Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-12-04T11:11:22.5796806Z  # latter, it will wait for the Docker images to become available before continuing 2025-12-04T11:11:22.5797041Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-12-04T11:11:22.5797234Z  # It's a Docker build job, let's build the image 2025-12-04T11:11:22.5797396Z  break 2025-12-04T11:11:22.5797511Z  else 2025-12-04T11:11:22.5797666Z  # It's a regular build job, wait for the image to become available 2025-12-04T11:11:22.5797842Z  sleep 300 2025-12-04T11:11:22.5797958Z  fi 2025-12-04T11:11:22.5798057Z done 2025-12-04T11:11:22.5798335Z  2025-12-04T11:11:22.5798484Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-12-04T11:11:22.5798706Z # be empty. 
The default action would be to continue rebuild the image 2025-12-04T11:11:22.5798908Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-12-04T11:11:22.5799090Z  # if we're on the base branch then use the parent commit 2025-12-04T11:11:22.5799376Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-12-04T11:11:22.5799516Z else 2025-12-04T11:11:22.5799654Z  # otherwise we're on a PR, so use the most recent base commit 2025-12-04T11:11:22.5799843Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-12-04T11:11:22.5799989Z fi 2025-12-04T11:11:22.5800089Z  2025-12-04T11:11:22.5800194Z if [[ -z "${MERGE_BASE}" ]]; then 2025-12-04T11:11:22.5800342Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5800480Z  2025-12-04T11:11:22.5800677Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-12-04T11:11:22.5800885Z  exit 0 2025-12-04T11:11:22.5800986Z fi 2025-12-04T11:11:22.5801075Z  2025-12-04T11:11:22.5801207Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-12-04T11:11:22.5801474Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-12-04T11:11:22.5801700Z  exit 1 2025-12-04T11:11:22.5801793Z fi 2025-12-04T11:11:22.5801888Z  2025-12-04T11:11:22.5802037Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-12-04T11:11:22.5802310Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-12-04T11:11:22.5802534Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-12-04T11:11:22.5802797Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-12-04T11:11:22.5803079Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-12-04T11:11:22.5803255Z fi 2025-12-04T11:11:22.5803346Z  2025-12-04T11:11:22.5803462Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5808128Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.5808390Z env: 2025-12-04T11:11:22.5808492Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.5808636Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.5808818Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.5808989Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.5809503Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.5810008Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.5810129Z AWS_REGION: us-east-1 2025-12-04T11:11:22.5810362Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.5810523Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.5812571Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.5812691Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T11:11:22.5812835Z BASE_REVISION: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:22.5813158Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5813529Z DOCKER_TAG: pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5813771Z DOCKER_REGISTRY: 
308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5813923Z DOCKER_PUSH: 2025-12-04T11:11:22.5814027Z ##[endgroup] 2025-12-04T11:11:22.5832270Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5832467Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5834922Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:22.5835845Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5836214Z /home/runner/_work/_temp/884c5434-78da-4dd3-af27-7ddeb9346173.sh: line 5: aws: command not found 2025-12-04T11:11:22.5935646Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:22.5943636Z + sleep 1 2025-12-04T11:11:23.5953529Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:23.5957808Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:23.5958278Z /home/runner/_work/_temp/884c5434-78da-4dd3-af27-7ddeb9346173.sh: line 5: aws: command not found 2025-12-04T11:11:23.5958726Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:23.6071996Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:23.6086562Z + sleep 2 2025-12-04T11:11:25.6097178Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:25.6103142Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:25.6103976Z /home/runner/_work/_temp/884c5434-78da-4dd3-af27-7ddeb9346173.sh: line 5: aws: command not found 2025-12-04T11:11:25.6104763Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:25.6211188Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:25.6225304Z ++ date +%s 2025-12-04T11:11:25.6236242Z + START_TIME=1764846685 2025-12-04T11:11:25.6241092Z ++ date +%s 2025-12-04T11:11:25.6251013Z + [[ 1764839485 -lt 1764846685 ]] 2025-12-04T11:11:25.6251585Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:26.9741662Z { 2025-12-04T11:11:26.9742073Z "schemaVersion": 2, 2025-12-04T11:11:26.9742638Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-12-04T11:11:26.9743130Z "config": { 2025-12-04T11:11:26.9743516Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-12-04T11:11:26.9744016Z "size": 30522, 2025-12-04T11:11:26.9744482Z "digest": "sha256:79498ef00fdf8abfcde955fd685c3a7412c33ca80383b5905abfdc3c70621215" 2025-12-04T11:11:26.9745731Z }, 2025-12-04T11:11:26.9745964Z "layers": [ 2025-12-04T11:11:26.9746194Z { 2025-12-04T11:11:26.9746556Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9746992Z "size": 30594402, 2025-12-04T11:11:26.9747448Z "digest": "sha256:02de03a7213b62b792ec66a7efb8c86c4117ca00fb8651facf8ecfe33044b485" 2025-12-04T11:11:26.9747804Z }, 2025-12-04T11:11:26.9747956Z { 2025-12-04T11:11:26.9748458Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9748755Z "size": 1554, 2025-12-04T11:11:26.9749055Z "digest": "sha256:3a5718b5258e28918133dd74ea64bd506b2c15530a2fa8a72c45c5b0d8f7c7b0" 2025-12-04T11:11:26.9749384Z }, 2025-12-04T11:11:26.9749529Z { 2025-12-04T11:11:26.9749773Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9750079Z "size": 335779211, 2025-12-04T11:11:26.9750400Z "digest": 
"sha256:bf3aa22776924a41b55849f0f30cb22af45d41da1177a9d682cf94cde99d8f98" 2025-12-04T11:11:26.9750738Z }, 2025-12-04T11:11:26.9750887Z { 2025-12-04T11:11:26.9751129Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9751423Z "size": 704, 2025-12-04T11:11:26.9751717Z "digest": "sha256:9d58e5257cefd43e8226153d71d28a865253662146aa9fce9a9f95af67b497fa" 2025-12-04T11:11:26.9752038Z }, 2025-12-04T11:11:26.9752185Z { 2025-12-04T11:11:26.9752423Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9752712Z "size": 1770, 2025-12-04T11:11:26.9753007Z "digest": "sha256:fde80a64553533a56c032d4bc388837e7d4631a0424d1bfe135703165b67fd4d" 2025-12-04T11:11:26.9753330Z }, 2025-12-04T11:11:26.9753477Z { 2025-12-04T11:11:26.9753715Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9754186Z "size": 485, 2025-12-04T11:11:26.9754671Z "digest": "sha256:6931c5f20e80e481e4f484471ff3a02878b4f8c54a9a5a4717213fdaa35c0bff" 2025-12-04T11:11:26.9754994Z }, 2025-12-04T11:11:26.9755147Z { 2025-12-04T11:11:26.9755385Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9755677Z "size": 120663474, 2025-12-04T11:11:26.9755993Z "digest": "sha256:170ea6d3edd62991e37d2e6ebe53dfcd4601f5d42e8f9720af5f8db5fc267856" 2025-12-04T11:11:26.9756323Z }, 2025-12-04T11:11:26.9756471Z { 2025-12-04T11:11:26.9756710Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9756999Z "size": 4433, 2025-12-04T11:11:26.9757266Z "digest": "sha256:dc8487f6c81cac00fa33031f8d3481e2c3634c4f064a9c4c36b87b41e78bc9fb" 2025-12-04T11:11:26.9757507Z }, 2025-12-04T11:11:26.9757616Z { 2025-12-04T11:11:26.9757791Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9758003Z "size": 1755, 2025-12-04T11:11:26.9758284Z "digest": "sha256:9748c5348f39a11c960c49fd9219fdea1c23e612ed11a02d71501424defc80f5" 2025-12-04T11:11:26.9758527Z }, 2025-12-04T11:11:26.9758632Z { 2025-12-04T11:11:26.9758814Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9759027Z "size": 724, 2025-12-04T11:11:26.9759246Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9759486Z }, 2025-12-04T11:11:26.9759596Z { 2025-12-04T11:11:26.9759800Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9760016Z "size": 3378352584, 2025-12-04T11:11:26.9760250Z "digest": "sha256:af88f886884fe6f1a1992efb7ce8473901f795eef69caa199443f3e076fdfd5b" 2025-12-04T11:11:26.9760578Z }, 2025-12-04T11:11:26.9760865Z { 2025-12-04T11:11:26.9761141Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9761356Z "size": 396, 2025-12-04T11:11:26.9761576Z "digest": "sha256:32fbb88555c4195c45c7008cf92e389d67acc79a7e382503003ef93bcb886afe" 2025-12-04T11:11:26.9761822Z }, 2025-12-04T11:11:26.9761933Z { 2025-12-04T11:11:26.9762189Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9762417Z "size": 80171601, 2025-12-04T11:11:26.9762662Z "digest": "sha256:3231e1ab814b143b244037c540b637be259085834865ac43b1ed2b6f6ad631e1" 2025-12-04T11:11:26.9762898Z }, 2025-12-04T11:11:26.9763010Z { 2025-12-04T11:11:26.9763187Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9763400Z "size": 787, 2025-12-04T11:11:26.9763623Z "digest": "sha256:80061bf5dcbb9a4e38ac865a9cdc0a615bb294e3e6bfa357a6d515dcf3f54abc" 
2025-12-04T11:11:26.9763871Z }, 2025-12-04T11:11:26.9763981Z { 2025-12-04T11:11:26.9764155Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9764367Z "size": 106, 2025-12-04T11:11:26.9764586Z "digest": "sha256:6e9524f4518ec02b47ff12c55b6b6afbc65b3f4be59072e2afe20c2c87522549" 2025-12-04T11:11:26.9764832Z }, 2025-12-04T11:11:26.9764938Z { 2025-12-04T11:11:26.9765120Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9765355Z "size": 1495, 2025-12-04T11:11:26.9765572Z "digest": "sha256:ce919d4bf5eeff71d49b160a16603117225530497c3905e02224227d11e2ff88" 2025-12-04T11:11:26.9765810Z }, 2025-12-04T11:11:26.9765920Z { 2025-12-04T11:11:26.9766095Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9766310Z "size": 548601195, 2025-12-04T11:11:26.9766534Z "digest": "sha256:47681e3e6f37423139a5c86549ffbb43e4f258344b0461208f5821263da152e9" 2025-12-04T11:11:26.9766769Z }, 2025-12-04T11:11:26.9766877Z { 2025-12-04T11:11:26.9767052Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9767250Z "size": 162, 2025-12-04T11:11:26.9767427Z "digest": "sha256:cb70fe22c9ebacebfe8402519059c8a66da6d5a77979e4c0ecdb3a762bebe357" 2025-12-04T11:11:26.9767675Z }, 2025-12-04T11:11:26.9767764Z { 2025-12-04T11:11:26.9767905Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9768083Z "size": 104, 2025-12-04T11:11:26.9768305Z "digest": "sha256:17858e829c8cfe9a7e22516e03ad5273d8cf5c50f58edb10ff60c74e15c8e1f6" 2025-12-04T11:11:26.9768498Z }, 2025-12-04T11:11:26.9768588Z { 2025-12-04T11:11:26.9768727Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9768897Z "size": 724, 2025-12-04T11:11:26.9769072Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9769263Z }, 2025-12-04T11:11:26.9769354Z { 2025-12-04T11:11:26.9769496Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9769667Z "size": 196, 2025-12-04T11:11:26.9769843Z "digest": "sha256:a63f3b4eed1157bcb3c51b64196e74e9f10d1f923652b02fd433c6ed993597ff" 2025-12-04T11:11:26.9770038Z }, 2025-12-04T11:11:26.9770130Z { 2025-12-04T11:11:26.9770277Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9770448Z "size": 2584, 2025-12-04T11:11:26.9770635Z "digest": "sha256:10ab3d1afbc4cb2d3ced8f3e0072c0b1dd124dcadcf68b95fadf8a7a9f663860" 2025-12-04T11:11:26.9770831Z }, 2025-12-04T11:11:26.9770920Z { 2025-12-04T11:11:26.9771061Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9771234Z "size": 7652105336, 2025-12-04T11:11:26.9771418Z "digest": "sha256:98ca88b5095b449a2f2d753a21217856271912fbe51c2d99f928a2196f4097d5" 2025-12-04T11:11:26.9771609Z }, 2025-12-04T11:11:26.9771698Z { 2025-12-04T11:11:26.9771841Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9772012Z "size": 135, 2025-12-04T11:11:26.9772184Z "digest": "sha256:025c90839a58c768b3cc444e48cae67c1a5b2c85320ad8827231f0ba390cf9aa" 2025-12-04T11:11:26.9772374Z }, 2025-12-04T11:11:26.9772466Z { 2025-12-04T11:11:26.9772606Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9772780Z "size": 104, 2025-12-04T11:11:26.9773023Z "digest": "sha256:9255df5942ae69fee24f8074314f451d5d2f1ca71b6c777274297fd43a0032d8" 2025-12-04T11:11:26.9773212Z }, 2025-12-04T11:11:26.9773303Z { 2025-12-04T11:11:26.9773442Z 
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9773612Z "size": 612, 2025-12-04T11:11:26.9773788Z "digest": "sha256:f71ca9d4ed1c4ca8177602f3cb0db83d9787ea6c258a8ef203387b308ff3e0f0" 2025-12-04T11:11:26.9773980Z }, 2025-12-04T11:11:26.9774067Z { 2025-12-04T11:11:26.9774206Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9774371Z "size": 838191953, 2025-12-04T11:11:26.9774552Z "digest": "sha256:d02b47b56ca7f3598f5943d4fdc7139d5e3d3bc82d49185cedf9817dd55fc75c" 2025-12-04T11:11:26.9774738Z }, 2025-12-04T11:11:26.9774824Z { 2025-12-04T11:11:26.9774959Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9775126Z "size": 111, 2025-12-04T11:11:26.9775299Z "digest": "sha256:40279492aea7bc8fb650842b495912195621c21b14cef4c717a9e0a9fc535131" 2025-12-04T11:11:26.9775483Z }, 2025-12-04T11:11:26.9775568Z { 2025-12-04T11:11:26.9775699Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9775864Z "size": 1556, 2025-12-04T11:11:26.9776035Z "digest": "sha256:33a27ce74abd7e32a03a564fc45005bc75904b53ad516f18d47facbeb2f2794e" 2025-12-04T11:11:26.9776225Z }, 2025-12-04T11:11:26.9776311Z { 2025-12-04T11:11:26.9776452Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9776622Z "size": 107, 2025-12-04T11:11:26.9776795Z "digest": "sha256:6b66ed335d1d8df6140caba76d9c2babed83bb37962e1e638825d49e67184fa5" 2025-12-04T11:11:26.9776985Z }, 2025-12-04T11:11:26.9777074Z { 2025-12-04T11:11:26.9777210Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9777411Z "size": 166, 2025-12-04T11:11:26.9777573Z "digest": "sha256:9f010fa04118bfee2d7b4481e6badb714032bde0652b04151a6599e57e1bd91b" 2025-12-04T11:11:26.9777751Z }, 2025-12-04T11:11:26.9777842Z { 2025-12-04T11:11:26.9777973Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9778132Z "size": 3702493, 2025-12-04T11:11:26.9778408Z "digest": "sha256:6c64d5e8bb6ae6ef4e3f1d316429d8b14a6e8a1fb410fb83b96c8bbd4a0a095c" 2025-12-04T11:11:26.9778590Z }, 2025-12-04T11:11:26.9778674Z { 2025-12-04T11:11:26.9778804Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9778962Z "size": 107, 2025-12-04T11:11:26.9779130Z "digest": "sha256:c20ea058f549f5f5538c95c5e0da23afbbc9fb7ffc1987d126fe684eeed743f5" 2025-12-04T11:11:26.9779314Z }, 2025-12-04T11:11:26.9779399Z { 2025-12-04T11:11:26.9779530Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9779691Z "size": 829, 2025-12-04T11:11:26.9779855Z "digest": "sha256:3c4fd2d54638a1336d39769fe36041aa6d186a8dea0e7096b8d8a7068ba0d3c0" 2025-12-04T11:11:26.9780034Z }, 2025-12-04T11:11:26.9780117Z { 2025-12-04T11:11:26.9780249Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9780407Z "size": 26673844, 2025-12-04T11:11:26.9780575Z "digest": "sha256:964ebac3d7a95c64ea7f0d828cd58e6244cc955e9a099a2525079ecf64026e3f" 2025-12-04T11:11:26.9780753Z }, 2025-12-04T11:11:26.9780831Z { 2025-12-04T11:11:26.9780960Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9781119Z "size": 104, 2025-12-04T11:11:26.9781284Z "digest": "sha256:2aaa7210673fc5bd15d36e54ee5c3fb495d1eafa1cb8d686054ccedb1c37bfc8" 2025-12-04T11:11:26.9781468Z }, 2025-12-04T11:11:26.9781552Z { 2025-12-04T11:11:26.9781682Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9781843Z 
"size": 424, 2025-12-04T11:11:26.9782005Z "digest": "sha256:fa273daa00371a98ed668535e14b8cc3cb425feba0b601b3e3c72314d0234312" 2025-12-04T11:11:26.9782190Z }, 2025-12-04T11:11:26.9782275Z { 2025-12-04T11:11:26.9782452Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9782612Z "size": 19279582, 2025-12-04T11:11:26.9782784Z "digest": "sha256:d931a62fd2408369decfa0e6eac11768e35d0ffddee87d769c82aaf1ad7e2899" 2025-12-04T11:11:26.9782966Z }, 2025-12-04T11:11:26.9783050Z { 2025-12-04T11:11:26.9783181Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9783342Z "size": 826, 2025-12-04T11:11:26.9783504Z "digest": "sha256:d3573d61c28e1400840260d3c2c786c9e104f6558162beac799e55b6f5c1e747" 2025-12-04T11:11:26.9783677Z }, 2025-12-04T11:11:26.9783761Z { 2025-12-04T11:11:26.9783893Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9784051Z "size": 724, 2025-12-04T11:11:26.9784213Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9784395Z }, 2025-12-04T11:11:26.9784481Z { 2025-12-04T11:11:26.9784611Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9784778Z "size": 149, 2025-12-04T11:11:26.9784939Z "digest": "sha256:f9b32f08c49055dd61bd359d5f42f6adb9e5a183c2821d97d11572dd7ce1e91f" 2025-12-04T11:11:26.9785120Z }, 2025-12-04T11:11:26.9785208Z { 2025-12-04T11:11:26.9785341Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9785500Z "size": 136, 2025-12-04T11:11:26.9785654Z "digest": "sha256:3a0206399d60f6e8897f78c8e8f81b59d51969a329ef45485d28ae19607ca72c" 2025-12-04T11:11:26.9785829Z }, 2025-12-04T11:11:26.9785912Z { 2025-12-04T11:11:26.9786043Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9786200Z "size": 140, 2025-12-04T11:11:26.9786360Z "digest": "sha256:386f322edd1c1c275126bab065c22fcd3950916c1fb8491a21a7f5c358af599a" 2025-12-04T11:11:26.9786537Z }, 2025-12-04T11:11:26.9786677Z { 2025-12-04T11:11:26.9786806Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9786963Z "size": 32, 2025-12-04T11:11:26.9787130Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T11:11:26.9787309Z }, 2025-12-04T11:11:26.9787394Z { 2025-12-04T11:11:26.9787527Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9787686Z "size": 223, 2025-12-04T11:11:26.9787846Z "digest": "sha256:bbe49df30697f6959cd958299909d9255cd54663ce2e9e2c2d378f8f9dfe8345" 2025-12-04T11:11:26.9788025Z }, 2025-12-04T11:11:26.9788109Z { 2025-12-04T11:11:26.9788279Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9788438Z "size": 346, 2025-12-04T11:11:26.9788598Z "digest": "sha256:d6630aa6f375b12cb7471c5b60eb32e02ff8d70adf4497e061d6c15fead186c7" 2025-12-04T11:11:26.9788782Z }, 2025-12-04T11:11:26.9788866Z { 2025-12-04T11:11:26.9789007Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9789163Z "size": 88302, 2025-12-04T11:11:26.9789328Z "digest": "sha256:6d807afc1309592c99c7d77af3874afb54c1718377fe721ac0cc616f59d291b9" 2025-12-04T11:11:26.9789499Z }, 2025-12-04T11:11:26.9789576Z { 2025-12-04T11:11:26.9789701Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9789854Z "size": 106, 2025-12-04T11:11:26.9790007Z "digest": 
"sha256:60b679430e4e0b7690392dfe4f5dc417847f7a3ba2b761ce747b66d412e1d956" 2025-12-04T11:11:26.9790178Z }, 2025-12-04T11:11:26.9790257Z { 2025-12-04T11:11:26.9790380Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9790532Z "size": 1671, 2025-12-04T11:11:26.9790692Z "digest": "sha256:3992ae84f9eda1c5c52fa96b1f1d0fc3f93c661c5cf0b971a504a260c290da49" 2025-12-04T11:11:26.9790865Z }, 2025-12-04T11:11:26.9790943Z { 2025-12-04T11:11:26.9791069Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9791224Z "size": 724, 2025-12-04T11:11:26.9791423Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9791598Z }, 2025-12-04T11:11:26.9791675Z { 2025-12-04T11:11:26.9791800Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9791952Z "size": 138, 2025-12-04T11:11:26.9792110Z "digest": "sha256:62d400609f9c38fce4745f72372423072ba0f142b3c03775ccb317f6c5240966" 2025-12-04T11:11:26.9792279Z }, 2025-12-04T11:11:26.9792356Z { 2025-12-04T11:11:26.9792486Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9792643Z "size": 119, 2025-12-04T11:11:26.9792801Z "digest": "sha256:7e7b097490967d568331cc9f8afdd02422fe101c6364ec5e12dba2970991e533" 2025-12-04T11:11:26.9793062Z }, 2025-12-04T11:11:26.9793179Z { 2025-12-04T11:11:26.9793356Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9807092Z "size": 6231259865, 2025-12-04T11:11:26.9807293Z "digest": "sha256:7dcdbd8421cb17aaa5d0cb965ddf94e196cb364e762b12ab78024cb25e3b6bcd" 2025-12-04T11:11:26.9807487Z }, 2025-12-04T11:11:26.9807576Z { 2025-12-04T11:11:26.9807719Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9807884Z "size": 174, 2025-12-04T11:11:26.9808049Z "digest": "sha256:cbb12613719bab9f179968227f9fb8881251992804e460b9a9e1c00f3ac4a0c5" 2025-12-04T11:11:26.9808277Z }, 2025-12-04T11:11:26.9808365Z { 2025-12-04T11:11:26.9808498Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9808659Z "size": 1896, 2025-12-04T11:11:26.9808825Z "digest": "sha256:e87038dce9bc8e13bd64006847d30ddcaf77455256c4985fccfec83f82d4b925" 2025-12-04T11:11:26.9809004Z }, 2025-12-04T11:11:26.9809088Z { 2025-12-04T11:11:26.9809222Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9809384Z "size": 162783968, 2025-12-04T11:11:26.9809627Z "digest": "sha256:e4606b636f96f1c80f4be26aeb9d6f5f990f6149789c2de160451c5ac76a467d" 2025-12-04T11:11:26.9809806Z }, 2025-12-04T11:11:26.9809889Z { 2025-12-04T11:11:26.9810021Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9810181Z "size": 302, 2025-12-04T11:11:26.9810342Z "digest": "sha256:6f2a5d33b946e561219b9968769773e36ce1d28bee8c62eff652098b7825fc79" 2025-12-04T11:11:26.9810518Z }, 2025-12-04T11:11:26.9810602Z { 2025-12-04T11:11:26.9810734Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9810892Z "size": 32, 2025-12-04T11:11:26.9811056Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T11:11:26.9811237Z }, 2025-12-04T11:11:26.9811319Z { 2025-12-04T11:11:26.9811448Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9811605Z "size": 108, 2025-12-04T11:11:26.9811765Z "digest": "sha256:a4f2bf2f19e63b91d46f2d9cf11a25c657517a6835996404da1e79a09d918b0e" 
2025-12-04T11:11:26.9811947Z }, 2025-12-04T11:11:26.9812029Z { 2025-12-04T11:11:26.9812162Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9812324Z "size": 54145661, 2025-12-04T11:11:26.9812494Z "digest": "sha256:1ae00acdac56cbc6d3f81b3c5d854a2b77c30d458b0fbe18c5935145364484f0" 2025-12-04T11:11:26.9812678Z } 2025-12-04T11:11:26.9812763Z ] 2025-12-04T11:11:26.9812848Z } 2025-12-04T11:11:26.9812944Z + exit 0 2025-12-04T11:11:26.9830545Z ##[group]Run set -eux 2025-12-04T11:11:26.9830681Z set -eux 2025-12-04T11:11:26.9830851Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-12-04T11:11:26.9831275Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-12-04T11:11:26.9836039Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:26.9836193Z env: 2025-12-04T11:11:26.9836292Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:26.9836486Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:26.9836667Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:26.9836839Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:26.9837341Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:26.9837833Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:26.9837955Z AWS_REGION: us-east-1 2025-12-04T11:11:26.9838234Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:26.9838392Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:26.9840438Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:26.9840549Z ##[endgroup] 2025-12-04T11:11:26.9868229Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-12-04T11:11:26.9868808Z /home/runner/_work/_temp/dbe5e399-4f52-4f96-b484-e9ecd25a675b.sh: line 3: aws: command not found 2025-12-04T11:11:26.9869217Z + jq --raw-output .SecretString 2025-12-04T11:11:26.9869547Z + jq -r .docker_hub_readonly_token 2025-12-04T11:11:26.9872453Z + docker login --username pytorchbot --password-stdin 2025-12-04T11:11:26.9984421Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:26.9992644Z + true 2025-12-04T11:11:27.0056739Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-12-04T11:11:27.0056940Z with: 2025-12-04T11:11:27.0057224Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:27.0057563Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0057880Z env: 2025-12-04T11:11:27.0057987Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:27.0058134Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:27.0058379Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:27.0058555Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:27.0059091Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon 
--group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:27.0059592Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:27.0059717Z AWS_REGION: us-east-1 2025-12-04T11:11:27.0059947Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:27.0060110Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:27.0062143Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:27.0062263Z ##[endgroup] 2025-12-04T11:11:27.0069252Z ##[group]Run set -x 2025-12-04T11:11:27.0069381Z set -x 2025-12-04T11:11:27.0069485Z set +e 2025-12-04T11:11:27.0069587Z  2025-12-04T11:11:27.0069706Z login() { 2025-12-04T11:11:27.0069901Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T11:11:27.0070100Z } 2025-12-04T11:11:27.0070192Z  2025-12-04T11:11:27.0070286Z retry () { 2025-12-04T11:11:27.0070410Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T11:11:27.0070546Z } 2025-12-04T11:11:27.0070638Z  2025-12-04T11:11:27.0070740Z retry login "${DOCKER_REGISTRY}" 2025-12-04T11:11:27.0070868Z  2025-12-04T11:11:27.0071057Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-12-04T11:11:27.0071305Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-12-04T11:11:27.0071458Z  2025-12-04T11:11:27.0071550Z set -e 2025-12-04T11:11:27.0071691Z # ignore output since only exit code is used for conditional 2025-12-04T11:11:27.0071881Z # only pull docker image if it's not available locally 2025-12-04T11:11:27.0072092Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-12-04T11:11:27.0072287Z  retry docker pull "${DOCKER_IMAGE}" 2025-12-04T11:11:27.0072418Z fi 2025-12-04T11:11:27.0076715Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:27.0076871Z env: 2025-12-04T11:11:27.0076969Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:27.0077110Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:27.0077292Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:27.0077464Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:27.0077973Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:27.0078519Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:27.0078644Z AWS_REGION: us-east-1 2025-12-04T11:11:27.0078788Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:27.0078948Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:27.0080954Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:27.0081328Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:27.0081654Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0081811Z ##[endgroup] 2025-12-04T11:11:27.0102697Z + set +e 2025-12-04T11:11:27.0102869Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0103133Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0106694Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:27.0107184Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0107470Z /home/runner/_work/_temp/01919194-d03e-4bd1-9aa7-72be92403208.sh: 
line 5: aws: command not found 2025-12-04T11:11:27.0215815Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:27.0223787Z + sleep 1 2025-12-04T11:11:28.0233607Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:28.0237669Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:28.0238514Z /home/runner/_work/_temp/01919194-d03e-4bd1-9aa7-72be92403208.sh: line 5: aws: command not found 2025-12-04T11:11:28.0239393Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:28.0351281Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:28.0363446Z + sleep 2 2025-12-04T11:11:30.0379146Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:30.0381474Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:30.0381943Z /home/runner/_work/_temp/01919194-d03e-4bd1-9aa7-72be92403208.sh: line 5: aws: command not found 2025-12-04T11:11:30.0383570Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:30.0491324Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:30.0511618Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:30.0512369Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-12-04T11:11:31.3943208Z + IMAGE_SIZE=18579.916069984436 2025-12-04T11:11:31.3943484Z + echo 'Compressed size of image in MB: 18579.916069984436' 2025-12-04T11:11:31.3943668Z + set -e 2025-12-04T11:11:31.3944049Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:31.3944392Z Compressed size of image in MB: 18579.916069984436 2025-12-04T11:11:31.4112805Z Prepare all required actions 2025-12-04T11:11:31.4128077Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-12-04T11:11:31.4128284Z with: 2025-12-04T11:11:31.4128591Z github-token: *** 2025-12-04T11:11:31.4128690Z env: 2025-12-04T11:11:31.4128784Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:31.4128924Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:31.4129109Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:31.4129278Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:31.4129779Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:31.4130273Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:31.4130404Z AWS_REGION: us-east-1 2025-12-04T11:11:31.4130583Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:31.4130737Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:31.4132761Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:31.4132867Z ##[endgroup] 2025-12-04T11:11:31.4139962Z ##[group]Run set -eux 2025-12-04T11:11:31.4140084Z set -eux 2025-12-04T11:11:31.4140256Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-12-04T11:11:31.4144777Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:31.4144922Z env: 2025-12-04T11:11:31.4145019Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:31.4145158Z 
RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:31.4145336Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:31.4145627Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:31.4146130Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:31.4146619Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:31.4146737Z AWS_REGION: us-east-1 2025-12-04T11:11:31.4146891Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:31.4147054Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:31.4149097Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:31.4149267Z GITHUB_TOKEN: *** 2025-12-04T11:11:31.4149365Z ##[endgroup] 2025-12-04T11:11:31.4168533Z + python3 .github/scripts/get_workflow_job_id.py 19922798714 linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:11:32.1324740Z Setting output job-id=57117547540 2025-12-04T11:11:32.1325163Z Setting output job-name=linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:11:32.1438009Z Prepare all required actions 2025-12-04T11:11:32.1438282Z Getting action download info 2025-12-04T11:11:32.3686402Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-12-04T11:11:33.2265647Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-12-04T11:11:34.0710761Z ##[group]Run ./.github/actions/download-build-artifacts 2025-12-04T11:11:34.0710927Z with: 2025-12-04T11:11:34.0711037Z name: linux-noble-rocm-py3.12-mi300 2025-12-04T11:11:34.0711168Z s3-bucket: gha-artifacts 2025-12-04T11:11:34.0711279Z env: 2025-12-04T11:11:34.0711377Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:34.0711513Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:34.0711707Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:34.0711872Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:34.0712399Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:34.0712892Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:34.0713006Z AWS_REGION: us-east-1 2025-12-04T11:11:34.0713185Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:34.0713334Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:34.0715350Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:34.0715454Z ##[endgroup] 2025-12-04T11:11:34.0730396Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T11:11:34.0730530Z with: 2025-12-04T11:11:34.0730634Z name: linux-noble-rocm-py3.12-mi300 2025-12-04T11:11:34.0730763Z s3-bucket: gha-artifacts 2025-12-04T11:11:34.0730879Z region: us-east-1 2025-12-04T11:11:34.0730974Z env: 2025-12-04T11:11:34.0731065Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:34.0731205Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:34.0731387Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:34.0731558Z RUNNER_DOCS_DIR: 
/home/runner/_work/_temp/docs 2025-12-04T11:11:34.0732069Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:34.0732565Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:34.0732687Z AWS_REGION: us-east-1 2025-12-04T11:11:34.0732822Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:34.0732971Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:34.0735100Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:34.0735199Z ##[endgroup] 2025-12-04T11:11:34.3064318Z (node:20336) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T11:11:34.3064548Z 2025-12-04T11:11:34.3065538Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T11:11:34.3066081Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T11:11:34.3066530Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T11:11:34.5823670Z Found 1 objects with prefix pytorch/pytorch/19922798714/linux-noble-rocm-py3.12-mi300/ 2025-12-04T11:11:34.5824386Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T11:12:10.1981185Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T11:12:10.1985102Z Artifact download has finished successfully 2025-12-04T11:12:10.2337806Z ##[group]Run unzip -o artifacts.zip 2025-12-04T11:12:10.2337985Z unzip -o artifacts.zip 2025-12-04T11:12:10.2342803Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:10.2342977Z env: 2025-12-04T11:12:10.2343294Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:10.2343452Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:10.2343659Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:10.2343854Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:10.2344451Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:10.2345030Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:10.2345166Z AWS_REGION: us-east-1 2025-12-04T11:12:10.2345345Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:10.2345522Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:10.2347880Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:10.2347989Z ##[endgroup] 2025-12-04T11:12:10.2384312Z Archive: artifacts.zip 2025-12-04T11:12:10.2385732Z creating: dist/ 2025-12-04T11:12:13.1724833Z inflating: dist/torch-2.10.0a0+gitffd9b0f-cp312-cp312-linux_x86_64.whl 2025-12-04T11:12:13.1804110Z inflating: dist/.ninja_log 2025-12-04T11:12:13.1804409Z creating: build/custom_test_artifacts/ 2025-12-04T11:12:13.1808879Z creating: build/custom_test_artifacts/custom-op-build/ 2025-12-04T11:12:13.1809399Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-12-04T11:12:13.1809964Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-12-04T11:12:13.1810577Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T11:12:13.1811171Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-12-04T11:12:13.1811761Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T11:12:13.1812434Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T11:12:13.1813042Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T11:12:13.1813745Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T11:12:13.1814446Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T11:12:13.1815103Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T11:12:13.1815739Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T11:12:13.1816354Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T11:12:13.1816906Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T11:12:13.1818139Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T11:12:13.1818678Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T11:12:13.1819211Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T11:12:13.1819778Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T11:12:13.1820265Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-12-04T11:12:13.1820657Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-12-04T11:12:13.1821069Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-12-04T11:12:13.1821604Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-12-04T11:12:13.1822085Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-12-04T11:12:13.1822840Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-12-04T11:12:13.1823347Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-12-04T11:12:13.1823826Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-12-04T11:12:13.1824309Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-12-04T11:12:13.1824803Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-12-04T11:12:13.1825299Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-12-04T11:12:13.1825784Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-12-04T11:12:13.1826283Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-12-04T11:12:13.1830746Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-12-04T11:12:13.1947134Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-12-04T11:12:13.1947495Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.d 2025-12-04T11:12:13.1947920Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-12-04T11:12:13.1948335Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-12-04T11:12:13.1948745Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-12-04T11:12:13.1949123Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-12-04T11:12:13.1949485Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-12-04T11:12:13.1949863Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-12-04T11:12:13.1950240Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-12-04T11:12:13.1950605Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-12-04T11:12:13.1950975Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-12-04T11:12:13.1951337Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-12-04T11:12:13.1962281Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-12-04T11:12:13.2009807Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-12-04T11:12:13.2010257Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.d 2025-12-04T11:12:13.2010603Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T11:12:13.2010927Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-12-04T11:12:13.2011227Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-12-04T11:12:13.2011502Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-12-04T11:12:13.2011861Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-12-04T11:12:13.2012356Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_outer_vec.cc 2025-12-04T11:12:13.2012804Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_vec_ext.cc 2025-12-04T11:12:13.2013349Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-12-04T11:12:13.2013964Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-12-04T11:12:13.2014303Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-12-04T11:12:13.2115622Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-12-04T11:12:13.2149523Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-12-04T11:12:13.2149830Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-12-04T11:12:13.2150123Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-12-04T11:12:13.2150437Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-12-04T11:12:13.2152380Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T11:12:13.2152735Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 
2025-12-04T11:12:13.2153082Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T11:12:13.2153461Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T11:12:13.2153820Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T11:12:13.2154433Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T11:12:13.2155154Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T11:12:13.2155552Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T11:12:13.2155925Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T11:12:13.2156288Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T11:12:13.2157327Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T11:12:13.2157919Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T11:12:13.2158409Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T11:12:13.2159443Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T11:12:13.2160071Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T11:12:13.2160439Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-12-04T11:12:13.2160738Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-12-04T11:12:13.2161045Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-12-04T11:12:13.2161372Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-12-04T11:12:13.2161828Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-12-04T11:12:13.2162242Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-12-04T11:12:13.2162632Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-12-04T11:12:13.2162995Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-12-04T11:12:13.2163374Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-12-04T11:12:13.2163755Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-12-04T11:12:13.2164137Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-12-04T11:12:13.2164517Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-12-04T11:12:13.2164939Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-12-04T11:12:13.2175245Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-12-04T11:12:13.2212152Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-12-04T11:12:13.2212528Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.d 2025-12-04T11:12:13.2213026Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T11:12:13.2213324Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-12-04T11:12:13.2213586Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-12-04T11:12:13.2213839Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-12-04T11:12:13.2214550Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-12-04T11:12:13.2214805Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_outer_vec.cc 2025-12-04T11:12:13.2215051Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_vec_ext.cc 2025-12-04T11:12:13.2215852Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-12-04T11:12:13.2216156Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-12-04T11:12:13.2216466Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-12-04T11:12:13.2239503Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-12-04T11:12:13.2239722Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-12-04T11:12:13.2239942Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-12-04T11:12:13.2240194Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-12-04T11:12:13.2242493Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T11:12:13.2242768Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-12-04T11:12:13.2243033Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T11:12:13.2243320Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T11:12:13.2243599Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T11:12:13.2244571Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T11:12:13.2245309Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T11:12:13.2245700Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T11:12:13.2246004Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T11:12:13.2246287Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T11:12:13.2247362Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T11:12:13.2248061Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T11:12:13.2248506Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T11:12:13.2249445Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T11:12:13.2250180Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T11:12:13.2250486Z creating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-12-04T11:12:13.2250794Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-12-04T11:12:13.2251054Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-12-04T11:12:13.2251321Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-12-04T11:12:13.2251623Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-12-04T11:12:13.2251963Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-12-04T11:12:13.2252287Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-12-04T11:12:13.2252591Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-12-04T11:12:13.2252905Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-12-04T11:12:13.2253227Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-12-04T11:12:13.2253539Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-12-04T11:12:13.2253850Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-12-04T11:12:13.2254161Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-12-04T11:12:13.2255330Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-12-04T11:12:13.2325268Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-12-04T11:12:13.2325594Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.d 2025-12-04T11:12:13.2325909Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-12-04T11:12:13.2326234Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-12-04T11:12:13.2326595Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-12-04T11:12:13.2326941Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-12-04T11:12:13.2327269Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-12-04T11:12:13.2327600Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-12-04T11:12:13.2327936Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-12-04T11:12:13.2328361Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-12-04T11:12:13.2328700Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-12-04T11:12:13.2329029Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-12-04T11:12:13.2340095Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 
2025-12-04T11:12:13.2372168Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-12-04T11:12:13.2372533Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.d 2025-12-04T11:12:13.2372866Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T11:12:13.2373241Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-12-04T11:12:13.2373519Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-12-04T11:12:13.2373786Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-12-04T11:12:13.2374421Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-12-04T11:12:13.2374701Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_outer_vec.cc 2025-12-04T11:12:13.2374959Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_vec_ext.cc 2025-12-04T11:12:13.2375767Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-12-04T11:12:13.2376119Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-12-04T11:12:13.2376473Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-12-04T11:12:13.2436706Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-12-04T11:12:13.2460315Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-12-04T11:12:13.2460533Z creating: build/lib/ 2025-12-04T11:12:13.2509838Z inflating: build/lib/libprotobuf-lite.a 2025-12-04T11:12:13.2775216Z inflating: build/lib/libprotobuf.a 2025-12-04T11:12:13.3076483Z inflating: build/lib/libprotoc.a 2025-12-04T11:12:13.3082308Z inflating: build/lib/libpthreadpool.a 2025-12-04T11:12:13.3086582Z inflating: build/lib/libcpuinfo.a 2025-12-04T11:12:13.3091105Z inflating: build/lib/libcpuinfo_internals.a 2025-12-04T11:12:13.3091572Z inflating: build/lib/libclog.a 2025-12-04T11:12:13.3103038Z inflating: build/lib/libpytorch_qnnpack.a 2025-12-04T11:12:13.3104163Z inflating: build/lib/libnnpack_reference_layers.a 2025-12-04T11:12:13.3216897Z inflating: build/lib/libmicrokernels-prod.a 2025-12-04T11:12:13.3227366Z inflating: build/lib/libnnpack.a 2025-12-04T11:12:13.3756875Z inflating: build/lib/libmicrokernels-all.a 2025-12-04T11:12:13.3798127Z inflating: build/lib/libgtest.a 2025-12-04T11:12:13.3808231Z inflating: build/lib/libgmock.a 2025-12-04T11:12:13.3808431Z inflating: build/lib/libgtest_main.a 2025-12-04T11:12:13.3808607Z inflating: build/lib/libgmock_main.a 2025-12-04T11:12:13.3863233Z inflating: build/lib/libXNNPACK.a 2025-12-04T11:12:13.3908695Z inflating: build/lib/libbenchmark.a 2025-12-04T11:12:13.3908915Z inflating: build/lib/libbenchmark_main.a 2025-12-04T11:12:13.3948903Z inflating: build/lib/libasmjit.a 2025-12-04T11:12:13.3949116Z inflating: build/lib/libjitprofiling.a 2025-12-04T11:12:13.3953798Z inflating: build/lib/libittnotify.a 2025-12-04T11:12:13.4646331Z inflating: build/lib/libfbgemm.a 2025-12-04T11:12:13.4664583Z inflating: build/lib/libtensorpipe_uv.a 2025-12-04T11:12:13.4988644Z inflating: build/lib/libtensorpipe.a 2025-12-04T11:12:13.5061173Z inflating: build/lib/libgloo.a 2025-12-04T11:12:13.5089055Z inflating: build/lib/libonnx_proto.a 2025-12-04T11:12:13.5335477Z inflating: build/lib/libgloo_hip.a 
2025-12-04T11:12:13.5761591Z inflating: build/lib/libonnx.a 2025-12-04T11:12:14.1798338Z inflating: build/lib/libdnnl.a 2025-12-04T11:12:14.1809494Z inflating: build/lib/libfmt.a 2025-12-04T11:12:14.1996032Z inflating: build/lib/libkineto.a 2025-12-04T11:12:14.2066558Z inflating: build/lib/libc10.so 2025-12-04T11:12:14.2067083Z inflating: build/lib/libtorch_global_deps.so 2025-12-04T11:12:14.2067863Z inflating: build/lib/libcaffe2_nvrtc.so 2025-12-04T11:12:14.2094999Z inflating: build/lib/libc10_hip.so 2025-12-04T11:12:14.2380117Z inflating: build/lib/libfbgemm_genai.a 2025-12-04T11:12:16.0971281Z inflating: build/lib/libtorch_cpu.so 2025-12-04T11:12:16.0973629Z inflating: build/lib/libshm.so 2025-12-04T11:12:16.9508609Z inflating: build/lib/libtorch_hip.so 2025-12-04T11:12:16.9509086Z inflating: build/lib/libtorch.so 2025-12-04T11:12:16.9520725Z inflating: build/lib/libjitbackend_test.so 2025-12-04T11:12:16.9534755Z inflating: build/lib/libbackend_with_compiler.so 2025-12-04T11:12:16.9577585Z inflating: build/lib/libtorchbind_test.so 2025-12-04T11:12:16.9593429Z inflating: build/lib/libaoti_custom_ops.so 2025-12-04T11:12:17.1040321Z inflating: build/lib/libtorch_python.so 2025-12-04T11:12:17.1062247Z inflating: build/lib/libnnapi_backend.so 2025-12-04T11:12:17.1062444Z creating: build/bin/ 2025-12-04T11:12:17.1062596Z creating: build/bin/CMakeFiles/ 2025-12-04T11:12:17.1062774Z inflating: build/bin/cmake_install.cmake 2025-12-04T11:12:17.1062962Z inflating: build/bin/CTestTestfile.cmake 2025-12-04T11:12:17.1341180Z inflating: build/bin/protoc-3.13.0.0 2025-12-04T11:12:17.1619396Z inflating: build/bin/protoc 2025-12-04T11:12:17.1655390Z inflating: build/bin/c10_AllocatorConfig_test 2025-12-04T11:12:17.1689339Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-12-04T11:12:17.1724022Z inflating: build/bin/c10_DeviceGuard_test 2025-12-04T11:12:17.1758863Z inflating: build/bin/c10_Device_test 2025-12-04T11:12:17.1792121Z inflating: build/bin/c10_StreamGuard_test 2025-12-04T11:12:17.1828774Z inflating: build/bin/c10_Scalar_test 2025-12-04T11:12:17.1868787Z inflating: build/bin/c10_DispatchKeySet_test 2025-12-04T11:12:17.1905291Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-12-04T11:12:17.1943515Z inflating: build/bin/c10_SymInt_test 2025-12-04T11:12:17.1982068Z inflating: build/bin/c10_InlineStreamGuard_test 2025-12-04T11:12:17.2019275Z inflating: build/bin/c10_SizesAndStrides_test 2025-12-04T11:12:17.2053088Z inflating: build/bin/c10_ArrayRef_test 2025-12-04T11:12:17.2099750Z inflating: build/bin/c10_cow_test 2025-12-04T11:12:17.2133166Z inflating: build/bin/c10_ConstexprCrc_test 2025-12-04T11:12:17.2166903Z inflating: build/bin/c10_DeadlockDetection_test 2025-12-04T11:12:17.2205278Z inflating: build/bin/c10_Enumerate_test 2025-12-04T11:12:17.2240944Z inflating: build/bin/c10_IntrusiveList_test 2025-12-04T11:12:17.2275533Z inflating: build/bin/c10_Half_test 2025-12-04T11:12:17.2311284Z inflating: build/bin/c10_Bitset_test 2025-12-04T11:12:17.2348987Z inflating: build/bin/c10_LeftRight_test 2025-12-04T11:12:17.2382796Z inflating: build/bin/c10_Semaphore_test 2025-12-04T11:12:17.2418928Z inflating: build/bin/c10_NetworkFlow_test 2025-12-04T11:12:17.2456297Z inflating: build/bin/c10_ThreadLocal_test 2025-12-04T11:12:17.2490522Z inflating: build/bin/c10_Synchronized_test 2025-12-04T11:12:17.2525572Z inflating: build/bin/c10_TypeIndex_test 2025-12-04T11:12:17.2560533Z inflating: build/bin/c10_accumulate_test 2025-12-04T11:12:17.2594175Z inflating: build/bin/c10_error_test 
2025-12-04T11:12:17.2628393Z inflating: build/bin/c10_bit_cast_test 2025-12-04T11:12:17.2666067Z inflating: build/bin/c10_bfloat16_test 2025-12-04T11:12:17.2703375Z inflating: build/bin/c10_complex_test 2025-12-04T11:12:17.2738875Z inflating: build/bin/c10_exception_test 2025-12-04T11:12:17.2776883Z inflating: build/bin/c10_complex_math_test 2025-12-04T11:12:17.2811086Z inflating: build/bin/c10_flags_test 2025-12-04T11:12:17.2845223Z inflating: build/bin/c10_generic_math_test 2025-12-04T11:12:17.2879759Z inflating: build/bin/c10_irange_test 2025-12-04T11:12:17.2979810Z inflating: build/bin/c10_intrusive_ptr_test 2025-12-04T11:12:17.3016047Z inflating: build/bin/c10_lazy_test 2025-12-04T11:12:17.3054534Z inflating: build/bin/c10_logging_test 2025-12-04T11:12:17.3088447Z inflating: build/bin/c10_nofatal_test 2025-12-04T11:12:17.3138060Z inflating: build/bin/c10_optional_test 2025-12-04T11:12:17.3174039Z inflating: build/bin/c10_registry_test 2025-12-04T11:12:17.3215393Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-12-04T11:12:17.3313195Z inflating: build/bin/c10_small_vector_test 2025-12-04T11:12:17.3348537Z inflating: build/bin/c10_ssize_test 2025-12-04T11:12:17.3386267Z inflating: build/bin/c10_string_util_test 2025-12-04T11:12:17.3419603Z inflating: build/bin/c10_string_view_test 2025-12-04T11:12:17.3449347Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-12-04T11:12:17.3483700Z inflating: build/bin/c10_tempfile_test 2025-12-04T11:12:17.3521629Z inflating: build/bin/c10_typeid_test 2025-12-04T11:12:17.3554921Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2025-12-04T11:12:17.3588253Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2025-12-04T11:12:17.3621641Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-12-04T11:12:17.3654871Z inflating: build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2025-12-04T11:12:17.3688078Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-12-04T11:12:17.3721685Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-12-04T11:12:17.3755207Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-12-04T11:12:17.3790734Z inflating: build/bin/c10_hip_HIPTest 2025-12-04T11:12:17.4155459Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-12-04T11:12:17.4529335Z inflating: build/bin/vec_test_all_types_AVX512 2025-12-04T11:12:17.4909341Z inflating: build/bin/vec_test_all_types_AVX2 2025-12-04T11:12:17.4972961Z inflating: build/bin/test_aoti_abi_check 2025-12-04T11:12:17.5006475Z inflating: build/bin/test_vec_half_DEFAULT 2025-12-04T11:12:17.5040399Z inflating: build/bin/test_vec_half_AVX2 2025-12-04T11:12:17.5074273Z inflating: build/bin/test_vec_half_AVX512 2025-12-04T11:12:17.5109720Z inflating: build/bin/BackoffTest 2025-12-04T11:12:17.5145637Z inflating: build/bin/FileStoreTest 2025-12-04T11:12:17.5183759Z inflating: build/bin/TCPStoreTest 2025-12-04T11:12:17.5220337Z inflating: build/bin/HashStoreTest 2025-12-04T11:12:17.5264950Z inflating: build/bin/ProcessGroupGlooTest 2025-12-04T11:12:17.5266680Z inflating: build/bin/example_allreduce 2025-12-04T11:12:17.5268739Z inflating: build/bin/torch_shm_manager 2025-12-04T11:12:17.5305227Z inflating: build/bin/static_runtime_bench 2025-12-04T11:12:17.5464742Z inflating: build/bin/static_runtime_test 2025-12-04T11:12:17.5513404Z inflating: build/bin/Dict_test 2025-12-04T11:12:17.5548505Z inflating: build/bin/Dimname_test 
2025-12-04T11:12:17.5591979Z inflating: build/bin/MaybeOwned_test 2025-12-04T11:12:17.5630336Z inflating: build/bin/NamedTensor_test 2025-12-04T11:12:17.5669873Z inflating: build/bin/apply_utils_test 2025-12-04T11:12:17.5709287Z inflating: build/bin/atest 2025-12-04T11:12:17.5752127Z inflating: build/bin/basic 2025-12-04T11:12:17.5788845Z inflating: build/bin/broadcast_test 2025-12-04T11:12:17.5823215Z inflating: build/bin/cpu_allocator_test 2025-12-04T11:12:17.5862120Z inflating: build/bin/cpu_generator_test 2025-12-04T11:12:17.5897826Z inflating: build/bin/cpu_profiling_allocator_test 2025-12-04T11:12:17.5958594Z inflating: build/bin/cpu_rng_test 2025-12-04T11:12:17.5993556Z inflating: build/bin/dlconvertor_test 2025-12-04T11:12:17.6149113Z inflating: build/bin/extension_backend_test 2025-12-04T11:12:17.6186341Z inflating: build/bin/half_test 2025-12-04T11:12:17.6250432Z inflating: build/bin/ivalue_test 2025-12-04T11:12:17.6284382Z inflating: build/bin/lazy_tensor_test 2025-12-04T11:12:17.6320107Z inflating: build/bin/math_kernel_test 2025-12-04T11:12:17.6356068Z inflating: build/bin/memory_format_test 2025-12-04T11:12:17.6392375Z inflating: build/bin/memory_overlapping_test 2025-12-04T11:12:17.6427002Z inflating: build/bin/operator_name_test 2025-12-04T11:12:17.6463021Z inflating: build/bin/mobile_memory_cleanup 2025-12-04T11:12:17.6500525Z inflating: build/bin/native_test 2025-12-04T11:12:17.6535925Z inflating: build/bin/packedtensoraccessor_test 2025-12-04T11:12:17.6570491Z inflating: build/bin/operators_test 2025-12-04T11:12:17.6619474Z inflating: build/bin/pow_test 2025-12-04T11:12:17.6657553Z inflating: build/bin/quantized_test 2025-12-04T11:12:17.6692155Z inflating: build/bin/reportMemoryUsage_test 2025-12-04T11:12:17.6730047Z inflating: build/bin/reduce_ops_test 2025-12-04T11:12:17.6764842Z inflating: build/bin/StorageUtils_test 2025-12-04T11:12:17.6803402Z inflating: build/bin/scalar_test 2025-12-04T11:12:17.6841279Z inflating: build/bin/scalar_tensor_test 2025-12-04T11:12:17.6879596Z inflating: build/bin/stride_properties_test 2025-12-04T11:12:17.6931950Z inflating: build/bin/tensor_iterator_test 2025-12-04T11:12:17.6969073Z inflating: build/bin/test_parallel 2025-12-04T11:12:17.7006433Z inflating: build/bin/type_ptr_test 2025-12-04T11:12:17.7045070Z inflating: build/bin/thread_init_test 2025-12-04T11:12:17.7080657Z inflating: build/bin/undefined_tensor_test 2025-12-04T11:12:17.7120392Z inflating: build/bin/type_test 2025-12-04T11:12:17.7154001Z inflating: build/bin/verify_api_visibility 2025-12-04T11:12:17.7189097Z inflating: build/bin/weakref_test 2025-12-04T11:12:17.7236617Z inflating: build/bin/legacy_vmap_test 2025-12-04T11:12:17.7271858Z inflating: build/bin/wrapdim_test 2025-12-04T11:12:17.7312172Z inflating: build/bin/IListRef_test 2025-12-04T11:12:17.7346865Z inflating: build/bin/xla_tensor_test 2025-12-04T11:12:17.7415386Z inflating: build/bin/List_test 2025-12-04T11:12:17.7493929Z inflating: build/bin/kernel_function_legacy_test 2025-12-04T11:12:17.7556546Z inflating: build/bin/kernel_function_test 2025-12-04T11:12:17.7600702Z inflating: build/bin/KernelFunction_test 2025-12-04T11:12:17.7682061Z inflating: build/bin/kernel_lambda_legacy_test 2025-12-04T11:12:17.7748408Z inflating: build/bin/kernel_lambda_test 2025-12-04T11:12:17.7811111Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-12-04T11:12:17.7851372Z inflating: build/bin/kernel_stackbased_test 2025-12-04T11:12:17.7886010Z inflating: build/bin/CppSignature_test 2025-12-04T11:12:17.7919500Z 
inflating: build/bin/op_allowlist_test 2025-12-04T11:12:17.8113881Z inflating: build/bin/op_registration_test 2025-12-04T11:12:17.8147053Z inflating: build/bin/hip_complex_math_test 2025-12-04T11:12:17.8191634Z inflating: build/bin/inline_container_test 2025-12-04T11:12:17.8228745Z inflating: build/bin/backend_fallback_test 2025-12-04T11:12:17.8264650Z inflating: build/bin/hip_apply_test 2025-12-04T11:12:17.8298198Z inflating: build/bin/hip_complex_test 2025-12-04T11:12:17.8331496Z inflating: build/bin/hip_distributions_test 2025-12-04T11:12:17.8364704Z inflating: build/bin/hip_generator_test 2025-12-04T11:12:17.8397864Z inflating: build/bin/hip_half_test 2025-12-04T11:12:17.8431080Z inflating: build/bin/hip_integer_divider_test 2025-12-04T11:12:17.8464312Z inflating: build/bin/hip_optional_test 2025-12-04T11:12:17.8497667Z inflating: build/bin/hip_packedtensoraccessor_test 2025-12-04T11:12:17.8533044Z inflating: build/bin/hip_dlconvertor_test 2025-12-04T11:12:17.8566212Z inflating: build/bin/hip_vectorized_test 2025-12-04T11:12:17.9317903Z inflating: build/bin/test_jit 2025-12-04T11:12:17.9537909Z inflating: build/bin/test_lazy 2025-12-04T11:12:17.9575144Z inflating: build/bin/test_dist_autograd 2025-12-04T11:12:17.9620861Z inflating: build/bin/test_cpp_rpc 2025-12-04T11:12:17.9622349Z inflating: build/bin/parallel_benchmark 2025-12-04T11:12:18.0353761Z inflating: build/bin/test_api 2025-12-04T11:12:18.0354166Z creating: .additional_ci_files/ 2025-12-04T11:12:18.0392859Z inflating: .additional_ci_files/test-times.json 2025-12-04T11:12:18.0536253Z inflating: .additional_ci_files/test-class-times.json 2025-12-04T11:12:18.0562532Z ##[group]Run rm artifacts.zip 2025-12-04T11:12:18.0562722Z rm artifacts.zip 2025-12-04T11:12:18.0567958Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:18.0568125Z env: 2025-12-04T11:12:18.0568285Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.0568433Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.0568625Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.0568808Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.0569346Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.0569875Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.0570001Z AWS_REGION: us-east-1 2025-12-04T11:12:18.0570196Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.0570366Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.0572521Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.0572642Z ##[endgroup] 2025-12-04T11:12:18.1699888Z ##[group]Run df -H 2025-12-04T11:12:18.1700087Z df -H 2025-12-04T11:12:18.1705881Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:18.1706064Z env: 2025-12-04T11:12:18.1706183Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.1706355Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.1706572Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.1706774Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.1707361Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add 
video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.1707962Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.1708115Z AWS_REGION: us-east-1 2025-12-04T11:12:18.1708565Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.1708746Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.1711094Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.1711230Z ##[endgroup] 2025-12-04T11:12:18.2430017Z Filesystem Size Used Avail Use% Mounted on 2025-12-04T11:12:18.2430236Z overlay 16T 544G 15T 4% / 2025-12-04T11:12:18.2430381Z tmpfs 68M 0 68M 0% /dev 2025-12-04T11:12:18.2430524Z /dev/md0 16T 544G 15T 4% /run 2025-12-04T11:12:18.2430667Z shm 68M 17k 68M 1% /dev/shm 2025-12-04T11:12:18.2430839Z amdprj2-k8s_2 5.5T 120G 5.4T 3% /home/runner/pytorch-data 2025-12-04T11:12:18.2431036Z tmpfs 3.3T 13k 3.3T 1% /run/secrets/kubernetes.io/serviceaccount 2025-12-04T11:12:18.2431204Z tmpfs 1.7T 0 1.7T 0% /proc/acpi 2025-12-04T11:12:18.2431799Z tmpfs 1.7T 0 1.7T 0% /proc/scsi 2025-12-04T11:12:18.2432054Z tmpfs 1.7T 0 1.7T 0% /sys/firmware 2025-12-04T11:12:18.2432265Z tmpfs 1.7T 0 1.7T 0% /sys/devices/virtual/powercap 2025-12-04T11:12:18.2462600Z Prepare all required actions 2025-12-04T11:12:18.2462853Z Getting action download info 2025-12-04T11:12:18.4637760Z ##[group]Run ./.github/actions/download-td-artifacts 2025-12-04T11:12:18.4637902Z with: 2025-12-04T11:12:18.4637994Z env: 2025-12-04T11:12:18.4638089Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.4638296Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.4638470Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.4638636Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.4639141Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.4639640Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.4639772Z AWS_REGION: us-east-1 2025-12-04T11:12:18.4639950Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.4640097Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.4642103Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.4642205Z ##[endgroup] 2025-12-04T11:12:18.4654937Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T11:12:18.4655076Z with: 2025-12-04T11:12:18.4655174Z name: td_results 2025-12-04T11:12:18.4655276Z s3-bucket: gha-artifacts 2025-12-04T11:12:18.4655384Z region: us-east-1 2025-12-04T11:12:18.4655481Z env: 2025-12-04T11:12:18.4655572Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.4655710Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.4655886Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.4656057Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.4656561Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.4657055Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.4657172Z AWS_REGION: us-east-1 2025-12-04T11:12:18.4657303Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.4657451Z 
AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.4659495Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.4659604Z ##[endgroup] 2025-12-04T11:12:18.6977317Z (node:20367) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T11:12:18.6978245Z 2025-12-04T11:12:18.6979146Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T11:12:18.6979887Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T11:12:18.6980479Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T11:12:18.9725362Z Found 1 objects with prefix pytorch/pytorch/19922798714/td_results/ 2025-12-04T11:12:18.9725755Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T11:12:19.4355662Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T11:12:19.4359652Z Artifact download has finished successfully 2025-12-04T11:12:19.4556195Z ##[group]Run mkdir -p .additional_ci_files 2025-12-04T11:12:19.4556414Z mkdir -p .additional_ci_files 2025-12-04T11:12:19.4556626Z mv td_results.json .additional_ci_files/td_results.json || true 2025-12-04T11:12:19.4561546Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:19.4561734Z env: 2025-12-04T11:12:19.4561852Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.4562181Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.4562402Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:19.4562615Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.4563385Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.4563890Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.4564012Z AWS_REGION: us-east-1 2025-12-04T11:12:19.4564286Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.4564448Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.4566459Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.4566571Z ##[endgroup] 2025-12-04T11:12:19.4640822Z ##[group]Run .github/scripts/parse_ref.py 2025-12-04T11:12:19.4641055Z .github/scripts/parse_ref.py 2025-12-04T11:12:19.4645743Z shell: /usr/bin/bash -e {0} 2025-12-04T11:12:19.4645864Z env: 2025-12-04T11:12:19.4645971Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.4646116Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.4646305Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:19.4646481Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.4647000Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.4647505Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.4647630Z AWS_REGION: us-east-1 2025-12-04T11:12:19.4647836Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.4647995Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.4650219Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.4650336Z ##[endgroup] 2025-12-04T11:12:19.4760600Z Setting output branch=main 2025-12-04T11:12:19.4831662Z 
Prepare all required actions 2025-12-04T11:12:19.4831880Z Getting action download info 2025-12-04T11:12:19.6742867Z ##[group]Run ./.github/actions/filter-test-configs 2025-12-04T11:12:19.6743037Z with: 2025-12-04T11:12:19.6743286Z github-token: *** 2025-12-04T11:12:19.6744643Z test-matrix: {"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]} 2025-12-04T11:12:19.6746211Z job-name: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:19.6746471Z env: 2025-12-04T11:12:19.6746579Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.6746730Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.6746923Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:19.6747245Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.6747767Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.6748337Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.6748466Z AWS_REGION: us-east-1 2025-12-04T11:12:19.6748607Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.6748769Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.6750806Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.6750922Z ##[endgroup] 2025-12-04T11:12:19.6769039Z ##[group]Run nick-fields/retry@v3.0.0 2025-12-04T11:12:19.6769177Z with: 2025-12-04T11:12:19.6769270Z shell: bash 2025-12-04T11:12:19.6769367Z timeout_minutes: 10 2025-12-04T11:12:19.6769472Z max_attempts: 5 2025-12-04T11:12:19.6769574Z retry_wait_seconds: 30 2025-12-04T11:12:19.6769889Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T11:12:19.6770190Z polling_interval_seconds: 1 2025-12-04T11:12:19.6770304Z warning_on_retry: true 2025-12-04T11:12:19.6770411Z continue_on_error: false 2025-12-04T11:12:19.6770519Z env: 2025-12-04T11:12:19.6770610Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.6770743Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.6770925Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 
2025-12-04T11:12:19.6771123Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.6771640Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.6772311Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.6772433Z AWS_REGION: us-east-1 2025-12-04T11:12:19.6772583Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.6772739Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.6774770Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.6774954Z GITHUB_TOKEN: *** 2025-12-04T11:12:19.6775061Z ##[endgroup] 2025-12-04T11:12:19.7161877Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T11:12:19.8575120Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T11:12:20.0136127Z Collecting requests==2.27.1 2025-12-04T11:12:20.0468730Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-12-04T11:12:20.0775047Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.1/63.1 KB 1.8 MB/s eta 0:00:00 2025-12-04T11:12:20.1273888Z Collecting pyyaml==6.0.2 2025-12-04T11:12:20.1329331Z Downloading PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB) 2025-12-04T11:12:20.1875506Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 751.2/751.2 KB 14.2 MB/s eta 0:00:00 2025-12-04T11:12:20.2083792Z Collecting idna<4,>=2.5 2025-12-04T11:12:20.2139671Z Downloading idna-3.11-py3-none-any.whl (71 kB) 2025-12-04T11:12:20.2168519Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.0/71.0 KB 33.2 MB/s eta 0:00:00 2025-12-04T11:12:20.2357145Z Collecting certifi>=2017.4.17 2025-12-04T11:12:20.2411958Z Downloading certifi-2025.11.12-py3-none-any.whl (159 kB) 2025-12-04T11:12:20.2476360Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 159.4/159.4 KB 28.0 MB/s eta 0:00:00 2025-12-04T11:12:20.2759652Z Collecting urllib3<1.27,>=1.21.1 2025-12-04T11:12:20.2814039Z Downloading urllib3-1.26.20-py2.py3-none-any.whl (144 kB) 2025-12-04T11:12:20.2873357Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.2/144.2 KB 27.9 MB/s eta 0:00:00 2025-12-04T11:12:20.3775137Z Collecting charset-normalizer~=2.0.0 2025-12-04T11:12:20.3831605Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-12-04T11:12:20.4361432Z Installing collected packages: urllib3, pyyaml, idna, charset-normalizer, certifi, requests 2025-12-04T11:12:20.5286723Z WARNING: The script normalizer is installed in '/home/runner/.local/bin' which is not on PATH. 2025-12-04T11:12:20.5287058Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T11:12:20.5457740Z Successfully installed certifi-2025.11.12 charset-normalizer-2.0.12 idna-3.11 pyyaml-6.0.2 requests-2.27.1 urllib3-1.26.20 2025-12-04T11:12:20.7163995Z Command completed after 1 attempt(s). 
2025-12-04T11:12:20.7214829Z ##[group]Run set -x 2025-12-04T11:12:20.7215062Z set -x 2025-12-04T11:12:20.7215217Z  2025-12-04T11:12:20.7215473Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T11:12:20.7215778Z # in runner workspace 2025-12-04T11:12:20.7216035Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-12-04T11:12:20.7222200Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:20.7222400Z env: 2025-12-04T11:12:20.7222526Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:20.7222702Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:20.7222920Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:20.7223132Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:20.7223781Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:20.7224414Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:20.7224564Z AWS_REGION: us-east-1 2025-12-04T11:12:20.7224814Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:20.7225018Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:20.7227538Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:20.7227846Z ##[endgroup] 2025-12-04T11:12:20.7244536Z + python3 /home/runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-12-04T11:12:20.7321876Z Setting output branch=main 2025-12-04T11:12:20.7352738Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T11:12:20.7352912Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T11:12:20.7353047Z echo "Job name: ${JOB_NAME}" 2025-12-04T11:12:20.7353166Z  2025-12-04T11:12:20.7353315Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T11:12:20.7353497Z # in runner workspace 2025-12-04T11:12:20.7353663Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-12-04T11:12:20.7353847Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-12-04T11:12:20.7353976Z  --job-name "${JOB_NAME}" \ 2025-12-04T11:12:20.7355309Z  --test-matrix "{"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]}" \ 2025-12-04T11:12:20.7356792Z  --selected-test-configs "" \ 2025-12-04T11:12:20.7356924Z  --pr-number "${PR_NUMBER}" \ 
2025-12-04T11:12:20.7357048Z  --tag "${TAG}" \ 2025-12-04T11:12:20.7357166Z  --event-name "${EVENT_NAME}" \ 2025-12-04T11:12:20.7357289Z  --schedule "${SCHEDULE}" \ 2025-12-04T11:12:20.7357412Z  --branch "${HEAD_BRANCH}" 2025-12-04T11:12:20.7361958Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:20.7362107Z env: 2025-12-04T11:12:20.7362200Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:20.7362333Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:20.7362509Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:20.7362674Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:20.7363185Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:20.7363672Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:20.7363788Z AWS_REGION: us-east-1 2025-12-04T11:12:20.7363950Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:20.7364098Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:20.7366089Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:20.7366276Z GITHUB_TOKEN: *** 2025-12-04T11:12:20.7366511Z JOB_NAME: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:20.7366755Z PR_NUMBER: 2025-12-04T11:12:20.7366844Z TAG: 2025-12-04T11:12:20.7366931Z EVENT_NAME: schedule 2025-12-04T11:12:20.7367032Z SCHEDULE: 29 8 * * * 2025-12-04T11:12:20.7367134Z HEAD_BRANCH: main 2025-12-04T11:12:20.7367232Z ##[endgroup] 2025-12-04T11:12:20.7381749Z Workflow: periodic-rocm-mi300 2025-12-04T11:12:20.7382014Z Job name: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:21.2969611Z Setting output keep-going=True 2025-12-04T11:12:21.2970091Z Setting output ci-verbose-test-logs=False 2025-12-04T11:12:21.2970486Z Setting output ci-test-showlocals=False 2025-12-04T11:12:21.2970867Z Setting output ci-no-test-timeout=False 2025-12-04T11:12:21.2971219Z Setting output ci-no-td=False 2025-12-04T11:12:21.2971556Z Setting output ci-td-distributed=False 2025-12-04T11:12:21.2971914Z Setting output is-unstable=False 2025-12-04T11:12:21.2972247Z Setting output reenabled-issues= 2025-12-04T11:12:21.2978462Z Setting output test-matrix={"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], 
"mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]} 2025-12-04T11:12:21.2983839Z Setting output is-test-matrix-empty=False 2025-12-04T11:12:21.3060548Z ##[group]Run echo "Filtered matrix:" 2025-12-04T11:12:21.3060798Z echo "Filtered matrix:" 2025-12-04T11:12:21.3064425Z echo "{"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], 
"mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]}" 2025-12-04T11:12:21.3067787Z  2025-12-04T11:12:21.3067895Z echo 2025-12-04T11:12:21.3068041Z echo "Is the current job unstable? False" 2025-12-04T11:12:21.3068243Z  2025-12-04T11:12:21.3068343Z echo 2025-12-04T11:12:21.3068472Z echo "Is keep-going label set? True" 2025-12-04T11:12:21.3068623Z  2025-12-04T11:12:21.3068725Z echo 2025-12-04T11:12:21.3068842Z echo "Reenabled issues? " 2025-12-04T11:12:21.3073337Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:21.3073492Z env: 2025-12-04T11:12:21.3073593Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3073735Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3073916Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3074088Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3074602Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3075098Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3075220Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3075406Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3075624Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3077795Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3077910Z ##[endgroup] 2025-12-04T11:12:21.3103066Z Filtered matrix: 2025-12-04T11:12:21.3106118Z {include: [{config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: 
distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests}]} 2025-12-04T11:12:21.3109001Z 2025-12-04T11:12:21.3109055Z Is the current job unstable? False 2025-12-04T11:12:21.3109143Z 2025-12-04T11:12:21.3109197Z Is keep-going label set? True 2025-12-04T11:12:21.3109424Z 2025-12-04T11:12:21.3109467Z Reenabled issues? 2025-12-04T11:12:21.3140521Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T11:12:21.3140915Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T11:12:21.3144964Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:21.3145192Z env: 2025-12-04T11:12:21.3145337Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3145544Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3145814Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3146068Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3146814Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3147590Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3147776Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3147980Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3148264Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3151267Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3151430Z JOB_TIMEOUT: 600 2025-12-04T11:12:21.3151580Z ##[endgroup] 2025-12-04T11:12:21.3184601Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:12:21.3184827Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:12:21.3185024Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:12:21.3189695Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:21.3189848Z env: 2025-12-04T11:12:21.3189945Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3190083Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3190265Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3190446Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3190967Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3191465Z 
AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3191590Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3191762Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3191923Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3193942Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3194053Z ##[endgroup] 2025-12-04T11:12:21.3268402Z ##[group]Run set -x 2025-12-04T11:12:21.3268572Z set -x 2025-12-04T11:12:21.3268675Z  2025-12-04T11:12:21.3268797Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-12-04T11:12:21.3268965Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-12-04T11:12:21.3269141Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-12-04T11:12:21.3269292Z  TEST_COMMAND=.ci/caffe2/test.sh 2025-12-04T11:12:21.3269421Z else 2025-12-04T11:12:21.3269536Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T11:12:21.3269661Z fi 2025-12-04T11:12:21.3269755Z  2025-12-04T11:12:21.3269898Z # detached container should get cleaned up by teardown_ec2_linux 2025-12-04T11:12:21.3270110Z # TODO: Stop building test binaries as part of the build phase 2025-12-04T11:12:21.3270294Z # Used for GPU_FLAG since that doesn't play nice 2025-12-04T11:12:21.3270471Z # shellcheck disable=SC2086,SC2090 2025-12-04T11:12:21.3270612Z container_name=$(docker run \ 2025-12-04T11:12:21.3270743Z  ${GPU_FLAG:-} \ 2025-12-04T11:12:21.3270865Z  -e BUILD_ENVIRONMENT \ 2025-12-04T11:12:21.3270993Z  -e PR_NUMBER \ 2025-12-04T11:12:21.3271240Z  -e GITHUB_ACTIONS \ 2025-12-04T11:12:21.3271365Z  -e GITHUB_REPOSITORY \ 2025-12-04T11:12:21.3271492Z  -e GITHUB_WORKFLOW \ 2025-12-04T11:12:21.3271610Z  -e GITHUB_JOB \ 2025-12-04T11:12:21.3271729Z  -e GITHUB_RUN_ID \ 2025-12-04T11:12:21.3271848Z  -e GITHUB_RUN_NUMBER \ 2025-12-04T11:12:21.3271968Z  -e GITHUB_RUN_ATTEMPT \ 2025-12-04T11:12:21.3272087Z  -e JOB_ID \ 2025-12-04T11:12:21.3272197Z  -e JOB_NAME \ 2025-12-04T11:12:21.3272309Z  -e BASE_SHA \ 2025-12-04T11:12:21.3272416Z  -e BRANCH \ 2025-12-04T11:12:21.3272519Z  -e SHA1 \ 2025-12-04T11:12:21.3272631Z  -e AWS_DEFAULT_REGION \ 2025-12-04T11:12:21.3272750Z  -e IN_WHEEL_TEST \ 2025-12-04T11:12:21.3272864Z  -e SHARD_NUMBER \ 2025-12-04T11:12:21.3272975Z  -e TEST_CONFIG \ 2025-12-04T11:12:21.3273094Z  -e NUM_TEST_SHARDS \ 2025-12-04T11:12:21.3273219Z  -e REENABLED_ISSUES \ 2025-12-04T11:12:21.3273347Z  -e CONTINUE_THROUGH_ERROR \ 2025-12-04T11:12:21.3273480Z  -e VERBOSE_TEST_LOGS \ 2025-12-04T11:12:21.3273608Z  -e TEST_SHOWLOCALS \ 2025-12-04T11:12:21.3273731Z  -e NO_TEST_TIMEOUT \ 2025-12-04T11:12:21.3273850Z  -e NO_TD \ 2025-12-04T11:12:21.3273974Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-12-04T11:12:21.3274127Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-12-04T11:12:21.3274279Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-12-04T11:12:21.3274423Z  -e TESTS_TO_INCLUDE \ 2025-12-04T11:12:21.3274553Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-12-04T11:12:21.3274689Z  -e DASHBOARD_TAG \ 2025-12-04T11:12:21.3274847Z  --env-file="${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T11:12:21.3275019Z  --ulimit stack=10485760:83886080 \ 2025-12-04T11:12:21.3275155Z  --ulimit core=0 \ 2025-12-04T11:12:21.3275308Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T11:12:21.3275474Z  --security-opt seccomp=unconfined \ 2025-12-04T11:12:21.3275622Z  --cap-add=SYS_PTRACE \ 2025-12-04T11:12:21.3275752Z  --shm-size="8g" \ 2025-12-04T11:12:21.3275872Z  --tty \ 2025-12-04T11:12:21.3275981Z  --detach \ 2025-12-04T11:12:21.3276104Z  --name="${container_name}" \ 2025-12-04T11:12:21.3276239Z  --user jenkins \ 2025-12-04T11:12:21.3276390Z  -v 
"${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-12-04T11:12:21.3276556Z  -w /var/lib/jenkins/workspace \ 2025-12-04T11:12:21.3276754Z  "${DOCKER_IMAGE}" 2025-12-04T11:12:21.3276871Z ) 2025-12-04T11:12:21.3276987Z # save container name for later step 2025-12-04T11:12:21.3277155Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2025-12-04T11:12:21.3277438Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2025-12-04T11:12:21.3277794Z docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2025-12-04T11:12:21.3282292Z shell: /usr/bin/bash -e {0} 2025-12-04T11:12:21.3282420Z env: 2025-12-04T11:12:21.3282526Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3282670Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3282861Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3283039Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3283563Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3284101Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3284226Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3284406Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3284564Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3286577Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3286716Z BUILD_ENVIRONMENT: linux-noble-rocm-py3.12-mi300 2025-12-04T11:12:21.3286859Z PR_NUMBER: 2025-12-04T11:12:21.3286969Z GITHUB_REPOSITORY: pytorch/pytorch 2025-12-04T11:12:21.3287111Z GITHUB_WORKFLOW: periodic-rocm-mi300 2025-12-04T11:12:21.3287243Z GITHUB_JOB: test 2025-12-04T11:12:21.3287358Z GITHUB_RUN_ID: 19922798714 2025-12-04T11:12:21.3287477Z GITHUB_RUN_NUMBER: 1861 2025-12-04T11:12:21.3287591Z GITHUB_RUN_ATTEMPT: 1 2025-12-04T11:12:21.3287702Z JOB_ID: 57117547540 2025-12-04T11:12:21.3287950Z JOB_NAME: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:21.3288247Z BRANCH: main 2025-12-04T11:12:21.3288368Z SHA1: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:21.3288534Z BASE_SHA: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:21.3288680Z TEST_CONFIG: distributed 2025-12-04T11:12:21.3288795Z SHARD_NUMBER: 2 2025-12-04T11:12:21.3288900Z NUM_TEST_SHARDS: 3 2025-12-04T11:12:21.3289008Z REENABLED_ISSUES: 2025-12-04T11:12:21.3289121Z CONTINUE_THROUGH_ERROR: True 2025-12-04T11:12:21.3289244Z VERBOSE_TEST_LOGS: False 2025-12-04T11:12:21.3289363Z TEST_SHOWLOCALS: False 2025-12-04T11:12:21.3289478Z NO_TEST_TIMEOUT: False 2025-12-04T11:12:21.3289587Z NO_TD: False 2025-12-04T11:12:21.3289865Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:12:21.3290167Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 1 2025-12-04T11:12:21.3290304Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-12-04T11:12:21.3290432Z TESTS_TO_INCLUDE: 2025-12-04T11:12:21.3290542Z DASHBOARD_TAG: 2025-12-04T11:12:21.3290694Z HUGGING_FACE_HUB_TOKEN: *** 2025-12-04T11:12:21.3290816Z ##[endgroup] 2025-12-04T11:12:21.3308694Z + [[ distributed == 
\m\u\l\t\i\g\p\u ]] 2025-12-04T11:12:21.3308854Z + [[ linux-noble-rocm-py3.12-mi300 == *onnx* ]] 2025-12-04T11:12:21.3308998Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T11:12:21.3317639Z +++ nproc --ignore=2 2025-12-04T11:12:21.3329005Z ++ docker run --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=254 -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/home/runner/_work/_temp/github_env_19922798714 --ulimit stack=10485760:83886080 --ulimit core=0 --env-file=/tmp/github_env_19922798714 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /home/runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:12:21.4772079Z + container_name=5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T11:12:21.4772372Z + echo CONTAINER_NAME=5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T11:12:21.4773524Z + docker exec -t 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d sh -c 'cd .. 
&& cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2025-12-04T11:12:24.8909942Z Processing ./dist/torch-2.10.0a0+gitffd9b0f-cp312-cp312-linux_x86_64.whl 2025-12-04T11:12:25.4258736Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.18.0) 2025-12-04T11:12:25.4259200Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (4.12.2) 2025-12-04T11:12:25.4260585Z Requirement already satisfied: setuptools in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (78.1.1) 2025-12-04T11:12:25.4263487Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (1.13.3) 2025-12-04T11:12:25.4264402Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (2.8.8) 2025-12-04T11:12:25.4265138Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.1.6) 2025-12-04T11:12:25.4266136Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (2025.10.0) 2025-12-04T11:12:25.4312820Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from sympy>=1.13.3->torch==2.10.0a0+gitffd9b0f) (1.3.0) 2025-12-04T11:12:25.4337131Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from jinja2->torch==2.10.0a0+gitffd9b0f) (3.0.3) 2025-12-04T11:12:25.5682737Z Installing collected packages: torch 2025-12-04T11:12:31.4322131Z Successfully installed torch-2.10.0a0+gitffd9b0f 2025-12-04T11:12:31.4736486Z + export TERM=vt100 2025-12-04T11:12:31.4736683Z + TERM=vt100 2025-12-04T11:12:31.4740935Z ++ dirname .ci/pytorch/test.sh 2025-12-04T11:12:31.4752535Z + source .ci/pytorch/common.sh 2025-12-04T11:12:31.4757580Z +++ dirname .ci/pytorch/common.sh 2025-12-04T11:12:31.4769096Z ++ source .ci/pytorch/common_utils.sh 2025-12-04T11:12:31.4770701Z +++ declare -f -t trap_add 2025-12-04T11:12:31.4776245Z ++ set -ex -o pipefail 2025-12-04T11:12:31.4776468Z ++ [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.4776705Z ++ unset HIP_PLATFORM 2025-12-04T11:12:31.4776888Z ++ export PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.4777101Z ++ PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.4778946Z ++ BUILD_TEST_LIBTORCH=0 2025-12-04T11:12:31.4785085Z ++ dirname .ci/pytorch/test.sh 2025-12-04T11:12:31.4796241Z + source .ci/pytorch/common-build.sh 2025-12-04T11:12:31.4799698Z ++ [[ linux-noble-rocm-py3.12-mi300 != *win-* ]] 2025-12-04T11:12:31.4810146Z ++++ dirname .ci/pytorch/common-build.sh 2025-12-04T11:12:31.4821686Z +++ cd .ci/pytorch 2025-12-04T11:12:31.4821829Z +++ pwd -P 2025-12-04T11:12:31.4824521Z ++ script_dir=/var/lib/jenkins/pytorch/.ci/pytorch 2025-12-04T11:12:31.4824881Z ++ [[ linux-noble-rocm-py3.12-mi300 == *-pch* ]] 2025-12-04T11:12:31.4825078Z ++ which sccache 2025-12-04T11:12:31.4839410Z ++ [[ -z '' ]] 2025-12-04T11:12:31.4839541Z ++ unset SCCACHE_BUCKET 2025-12-04T11:12:31.4839673Z ++ unset SCCACHE_REGION 2025-12-04T11:12:31.4839797Z ++ sccache --stop-server 2025-12-04T11:12:31.4862540Z ++ true 2025-12-04T11:12:31.4862674Z ++ rm -f /var/lib/jenkins/sccache_error.log 
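For reference, the in-container test step traced above reduces to the shell sequence below; this is a minimal sketch using the paths and wheel name shown in the log (the container name is whatever the earlier docker run printed and was saved as CONTAINER_NAME), not an authoritative reproduction of the CI script.

  # Minimal sketch of the exec'd test step, assuming CONTAINER_NAME is the
  # detached container started above and the wheel/test-script paths from the log.
  docker exec -t "${CONTAINER_NAME}" sh -c '
    cd .. &&
    cp -R workspace pytorch &&      # jenkins user cannot write to the mounted workspace
    cd pytorch &&
    pip install dist/*.whl &&       # install the torch wheel produced by the build phase
    .ci/pytorch/test.sh             # run the shard-specific test driver
  '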
2025-12-04T11:12:31.4871589Z ++ trap_add sccache_epilogue EXIT 2025-12-04T11:12:31.4871738Z ++ trap_add_cmd=sccache_epilogue 2025-12-04T11:12:31.4871884Z ++ shift 2025-12-04T11:12:31.4871997Z ++ for trap_add_name in "$@" 2025-12-04T11:12:31.4880814Z ++++ trap -p EXIT 2025-12-04T11:12:31.4882680Z +++ eval 'extract_trap_cmd ' 2025-12-04T11:12:31.4882814Z ++++ extract_trap_cmd 2025-12-04T11:12:31.4882946Z ++++ printf '%s\n' '' 2025-12-04T11:12:31.4883554Z +++ printf '%s\n' sccache_epilogue 2025-12-04T11:12:31.4885250Z ++ trap -- ' 2025-12-04T11:12:31.4885365Z sccache_epilogue' EXIT 2025-12-04T11:12:31.4885583Z ++ [[ -n '' ]] 2025-12-04T11:12:31.4885717Z ++ [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.4886000Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T11:12:31.4886160Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-12-04T11:12:31.4886282Z ++ sccache --start-server 2025-12-04T11:12:31.4902126Z sccache: Starting the server... 2025-12-04T11:12:31.5237830Z sccache: Listening on address 127.0.0.1:4226 2025-12-04T11:12:31.5249604Z ++ sccache --zero-stats 2025-12-04T11:12:31.5264493Z Statistics zeroed. 2025-12-04T11:12:31.5267882Z ++ which ccache 2025-12-04T11:12:31.5275724Z + [[ linux-noble-rocm-py3.12-mi300 != *rocm* ]] 2025-12-04T11:12:31.5275884Z + [[ linux-noble-rocm-py3.12-mi300 == *cuda* ]] 2025-12-04T11:12:31.5276032Z + echo 'Environment variables:' 2025-12-04T11:12:31.5276160Z Environment variables: 2025-12-04T11:12:31.5276272Z + env 2025-12-04T11:12:31.5285360Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T11:12:31.5285535Z CONTINUE_THROUGH_ERROR=True 2025-12-04T11:12:31.5285676Z BUILD_ENVIRONMENT=linux-noble-rocm-py3.12-mi300 2025-12-04T11:12:31.5285855Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5286099Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5286312Z GITHUB_ACTION=__run_2 2025-12-04T11:12:31.5286430Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T11:12:31.5286557Z GITHUB_RUN_NUMBER=1861 2025-12-04T11:12:31.5286666Z TEST_CONFIG=distributed 2025-12-04T11:12:31.5286811Z RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5286969Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T11:12:31.5287096Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T11:12:31.5287233Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T11:12:31.5287381Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5287507Z GITHUB_REF_TYPE=branch 2025-12-04T11:12:31.5287635Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5287908Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T11:12:31.5288308Z *** 2025-12-04T11:12:31.5288416Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T11:12:31.5288536Z GITHUB_ACTIONS=true 2025-12-04T11:12:31.5288653Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5288802Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5289027Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/periodic-rocm-mi300.yml@refs/heads/main 2025-12-04T11:12:31.5289234Z UCC_HOME=/usr 2025-12-04T11:12:31.5289343Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T11:12:31.5289587Z VERBOSE_TEST_LOGS=False 2025-12-04T11:12:31.5289700Z GITHUB_REF=refs/heads/main 2025-12-04T11:12:31.5289811Z RUNNER_OS=Linux 2025-12-04T11:12:31.5289908Z SHARD_NUMBER=2 2025-12-04T11:12:31.5290064Z GITHUB_REF_PROTECTED=true 2025-12-04T11:12:31.5290180Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T11:12:31.5290289Z 
HOME=/var/lib/jenkins 2025-12-04T11:12:31.5290406Z GITHUB_API_URL=https://api.github.com 2025-12-04T11:12:31.5290544Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T11:12:31.5290682Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T11:12:31.5290814Z LANG=C.UTF-8 2025-12-04T11:12:31.5290928Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T11:12:31.5291067Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.5291211Z RUNNER_TRACKING_ID=github_8890dc2f-279f-4663-b384-d74a6fcb36d4 2025-12-04T11:12:31.5291359Z RUNNER_ARCH=X64 2025-12-04T11:12:31.5291462Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T11:12:31.5291580Z NUM_TEST_SHARDS=3 2025-12-04T11:12:31.5291676Z UCX_HOME=/usr 2025-12-04T11:12:31.5291867Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5292214Z JOB_NAME=linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:31.5292516Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T11:12:31.5292708Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5292950Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T11:12:31.5293112Z GITHUB_EVENT_NAME=schedule 2025-12-04T11:12:31.5293270Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T11:12:31.5293435Z DASHBOARD_TAG= 2025-12-04T11:12:31.5293532Z GITHUB_RUN_ID=19922798714 2025-12-04T11:12:31.5293741Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5293969Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5294080Z PR_NUMBER= 2025-12-04T11:12:31.5294173Z GITHUB_RUN_ATTEMPT=1 2025-12-04T11:12:31.5294280Z ANACONDA_PYTHON_VERSION=3.12 2025-12-04T11:12:31.5294415Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T11:12:31.5294552Z TERM=vt100 2025-12-04T11:12:31.5294641Z INSTALLED_VISION=yes 2025-12-04T11:12:31.5294740Z BRANCH=main 2025-12-04T11:12:31.5294838Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T11:12:31.5294948Z TESTS_TO_INCLUDE= 2025-12-04T11:12:31.5295106Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T11:12:31.5295296Z GITHUB_SERVER_URL=https://github.com 2025-12-04T11:12:31.5295434Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T11:12:31.5295585Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T11:12:31.5295718Z REENABLED_ISSUES= 2025-12-04T11:12:31.5295811Z SHLVL=1 2025-12-04T11:12:31.5295898Z MAX_JOBS=254 2025-12-04T11:12:31.5296032Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T11:12:31.5296185Z GITHUB_ACTOR_ID=97764156 2025-12-04T11:12:31.5296302Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T11:12:31.5296463Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5296615Z GITHUB_REF_NAME=main 2025-12-04T11:12:31.5296716Z ROCM_PATH=/opt/rocm 2025-12-04T11:12:31.5296811Z GITHUB_JOB=test 2025-12-04T11:12:31.5296909Z NO_TEST_TIMEOUT=False 2025-12-04T11:12:31.5297020Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T11:12:31.5297137Z LC_ALL=C.UTF-8 2025-12-04T11:12:31.5297233Z GITHUB_RETENTION_DAYS=90 2025-12-04T11:12:31.5297351Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T11:12:31.5297479Z OPENSSL_DIR=/opt/openssl 2025-12-04T11:12:31.5297590Z GITHUB_ACTION_REPOSITORY= 2025-12-04T11:12:31.5297944Z 
PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:31.5298388Z GITHUB_BASE_REF= 2025-12-04T11:12:31.5298481Z CI=true 2025-12-04T11:12:31.5298576Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T11:12:31.5298687Z JOB_ID=57117547540 2025-12-04T11:12:31.5298781Z GITHUB_HEAD_REF= 2025-12-04T11:12:31.5298877Z GITHUB_ACTION_REF= 2025-12-04T11:12:31.5298978Z TEST_SHOWLOCALS=False 2025-12-04T11:12:31.5299091Z GITHUB_WORKFLOW=periodic-rocm-mi300 2025-12-04T11:12:31.5299218Z DEBIAN_FRONTEND=noninteractive 2025-12-04T11:12:31.5299424Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5299628Z NO_TD=False 2025-12-04T11:12:31.5299720Z OLDPWD=/var/lib/jenkins 2025-12-04T11:12:31.5299821Z _=/usr/bin/env 2025-12-04T11:12:31.5299950Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-12-04T11:12:31.5355563Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch 2025-12-04T11:12:31.5356984Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/bin 2025-12-04T11:12:31.5357300Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/lib 2025-12-04T11:12:31.5357528Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/test 2025-12-04T11:12:31.5357707Z + BUILD_DIR=build 2025-12-04T11:12:31.5358301Z + BUILD_RENAMED_DIR=build_renamed 2025-12-04T11:12:31.5358437Z + BUILD_BIN_DIR=build/bin 2025-12-04T11:12:31.5358568Z + SHARD_NUMBER=2 2025-12-04T11:12:31.5358671Z + NUM_TEST_SHARDS=3 2025-12-04T11:12:31.5358785Z + export TORCH_SERIALIZATION_DEBUG=1 2025-12-04T11:12:31.5358920Z + TORCH_SERIALIZATION_DEBUG=1 2025-12-04T11:12:31.5359039Z + export VALGRIND=ON 2025-12-04T11:12:31.5359143Z + VALGRIND=ON 2025-12-04T11:12:31.5359263Z + [[ linux-noble-rocm-py3.12-mi300 == *clang9* ]] 2025-12-04T11:12:31.5359428Z + [[ linux-noble-rocm-py3.12-mi300 == *xpu* ]] 2025-12-04T11:12:31.5359564Z + detect_cuda_arch 2025-12-04T11:12:31.5359684Z + [[ linux-noble-rocm-py3.12-mi300 == *cuda* ]] 2025-12-04T11:12:31.5359838Z + [[ linux-noble-rocm-py3.12-mi300 == *s390x* ]] 2025-12-04T11:12:31.5359974Z + [[ 0 == \1 ]] 2025-12-04T11:12:31.5360075Z + [[ True == \1 ]] 2025-12-04T11:12:31.5360190Z + [[ linux-noble-rocm-py3.12-mi300 != *bazel* ]] 2025-12-04T11:12:31.5360340Z ++ realpath build/custom_test_artifacts 2025-12-04T11:12:31.5366040Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2025-12-04T11:12:31.5366238Z + [[ -n '' ]] 2025-12-04T11:12:31.5366343Z + echo 'Environment variables' 2025-12-04T11:12:31.5366464Z Environment variables 2025-12-04T11:12:31.5366571Z + env 2025-12-04T11:12:31.5372328Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T11:12:31.5372516Z CONTINUE_THROUGH_ERROR=True 2025-12-04T11:12:31.5372659Z BUILD_ENVIRONMENT=linux-noble-rocm-py3.12-mi300 2025-12-04T11:12:31.5372835Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5373085Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5373302Z GITHUB_ACTION=__run_2 2025-12-04T11:12:31.5373428Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T11:12:31.5373551Z GITHUB_RUN_NUMBER=1861 2025-12-04T11:12:31.5373660Z TEST_CONFIG=distributed 2025-12-04T11:12:31.5373806Z 
RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5373973Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T11:12:31.5374104Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T11:12:31.5374246Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T11:12:31.5374401Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5374586Z GITHUB_REF_TYPE=branch 2025-12-04T11:12:31.5374715Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5375028Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T11:12:31.5375220Z *** 2025-12-04T11:12:31.5375319Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T11:12:31.5375438Z GITHUB_ACTIONS=true 2025-12-04T11:12:31.5375559Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5375877Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5376108Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/periodic-rocm-mi300.yml@refs/heads/main 2025-12-04T11:12:31.5376314Z UCC_HOME=/usr 2025-12-04T11:12:31.5376417Z TORCH_SERIALIZATION_DEBUG=1 2025-12-04T11:12:31.5376537Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T11:12:31.5376657Z VERBOSE_TEST_LOGS=False 2025-12-04T11:12:31.5376770Z GITHUB_REF=refs/heads/main 2025-12-04T11:12:31.5376879Z RUNNER_OS=Linux 2025-12-04T11:12:31.5376977Z SHARD_NUMBER=2 2025-12-04T11:12:31.5377080Z GITHUB_REF_PROTECTED=true 2025-12-04T11:12:31.5377198Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T11:12:31.5377312Z HOME=/var/lib/jenkins 2025-12-04T11:12:31.5377444Z GITHUB_API_URL=https://api.github.com 2025-12-04T11:12:31.5377586Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T11:12:31.5377732Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T11:12:31.5377866Z LANG=C.UTF-8 2025-12-04T11:12:31.5377985Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T11:12:31.5378136Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.5378336Z RUNNER_TRACKING_ID=github_8890dc2f-279f-4663-b384-d74a6fcb36d4 2025-12-04T11:12:31.5378490Z RUNNER_ARCH=X64 2025-12-04T11:12:31.5378600Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T11:12:31.5378785Z NUM_TEST_SHARDS=3 2025-12-04T11:12:31.5378887Z UCX_HOME=/usr 2025-12-04T11:12:31.5379087Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5379447Z JOB_NAME=linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:31.5379711Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T11:12:31.5379914Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5380162Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T11:12:31.5380331Z GITHUB_EVENT_NAME=schedule 2025-12-04T11:12:31.5380500Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T11:12:31.5380670Z DASHBOARD_TAG= 2025-12-04T11:12:31.5380774Z GITHUB_RUN_ID=19922798714 2025-12-04T11:12:31.5380993Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5381229Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5381347Z PR_NUMBER= 2025-12-04T11:12:31.5381450Z GITHUB_RUN_ATTEMPT=1 2025-12-04T11:12:31.5381560Z VALGRIND=ON 2025-12-04T11:12:31.5381664Z ANACONDA_PYTHON_VERSION=3.12 2025-12-04T11:12:31.5381805Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T11:12:31.5381944Z TERM=vt100 2025-12-04T11:12:31.5382035Z INSTALLED_VISION=yes 
2025-12-04T11:12:31.5382139Z BRANCH=main 2025-12-04T11:12:31.5382239Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T11:12:31.5382358Z TESTS_TO_INCLUDE= 2025-12-04T11:12:31.5382523Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T11:12:31.5382719Z GITHUB_SERVER_URL=https://github.com 2025-12-04T11:12:31.5382862Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T11:12:31.5383017Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T11:12:31.5383156Z REENABLED_ISSUES= 2025-12-04T11:12:31.5383255Z SHLVL=1 2025-12-04T11:12:31.5383351Z MAX_JOBS=254 2025-12-04T11:12:31.5383487Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T11:12:31.5383656Z GITHUB_ACTOR_ID=97764156 2025-12-04T11:12:31.5383779Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T11:12:31.5383943Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5384099Z GITHUB_REF_NAME=main 2025-12-04T11:12:31.5384205Z ROCM_PATH=/opt/rocm 2025-12-04T11:12:31.5384308Z GITHUB_JOB=test 2025-12-04T11:12:31.5384411Z NO_TEST_TIMEOUT=False 2025-12-04T11:12:31.5384527Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T11:12:31.5384649Z LC_ALL=C.UTF-8 2025-12-04T11:12:31.5384753Z GITHUB_RETENTION_DAYS=90 2025-12-04T11:12:31.5384920Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T11:12:31.5385052Z OPENSSL_DIR=/opt/openssl 2025-12-04T11:12:31.5385160Z GITHUB_ACTION_REPOSITORY= 2025-12-04T11:12:31.5385516Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:31.5385868Z GITHUB_BASE_REF= 2025-12-04T11:12:31.5385961Z CI=true 2025-12-04T11:12:31.5386056Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T11:12:31.5386171Z JOB_ID=57117547540 2025-12-04T11:12:31.5386265Z GITHUB_HEAD_REF= 2025-12-04T11:12:31.5386361Z GITHUB_ACTION_REF= 2025-12-04T11:12:31.5386456Z TEST_SHOWLOCALS=False 2025-12-04T11:12:31.5386567Z GITHUB_WORKFLOW=periodic-rocm-mi300 2025-12-04T11:12:31.5386698Z DEBIAN_FRONTEND=noninteractive 2025-12-04T11:12:31.5386906Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5387111Z NO_TD=False 2025-12-04T11:12:31.5398440Z OLDPWD=/var/lib/jenkins 2025-12-04T11:12:31.5398557Z _=/usr/bin/env 2025-12-04T11:12:31.5398661Z + echo 'Testing pytorch' 2025-12-04T11:12:31.5398770Z Testing pytorch 2025-12-04T11:12:31.5398869Z + export LANG=C.UTF-8 2025-12-04T11:12:31.5399054Z + LANG=C.UTF-8 2025-12-04T11:12:31.5399147Z + PR_NUMBER= 2025-12-04T11:12:31.5399242Z + [[ distributed == \d\e\f\a\u\l\t ]] 2025-12-04T11:12:31.5399375Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-12-04T11:12:31.5399520Z + [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.5399664Z + export HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T11:12:31.5399795Z + HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T11:12:31.5399917Z + [[ distributed == \s\l\o\w ]] 2025-12-04T11:12:31.5400063Z + [[ linux-noble-rocm-py3.12-mi300 == *slow-gradcheck* ]] 2025-12-04T11:12:31.5400227Z + [[ linux-noble-rocm-py3.12-mi300 == *cuda* ]] 2025-12-04T11:12:31.5400374Z + [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.5400521Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T11:12:31.5400661Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T11:12:31.5400796Z + [[ distributed == *crossref* ]] 2025-12-04T11:12:31.5400928Z + 
[[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.5401060Z + export VALGRIND=OFF 2025-12-04T11:12:31.5401171Z + VALGRIND=OFF 2025-12-04T11:12:31.5401266Z + rocminfo 2025-12-04T11:12:31.5494959Z ROCk module version 6.12.12 is loaded 2025-12-04T11:12:31.6276415Z ===================== 2025-12-04T11:12:31.6276591Z HSA System Attributes 2025-12-04T11:12:31.6276710Z ===================== 2025-12-04T11:12:31.6276823Z Runtime Version: 1.18 2025-12-04T11:12:31.6276942Z Runtime Ext Version: 1.14 2025-12-04T11:12:31.6277068Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T11:12:31.6277266Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T11:12:31.6277479Z Machine Model: LARGE 2025-12-04T11:12:31.6277659Z System Endianness: LITTLE 2025-12-04T11:12:31.6277806Z Mwaitx: DISABLED 2025-12-04T11:12:31.6277926Z XNACK enabled: NO 2025-12-04T11:12:31.6278044Z DMAbuf Support: YES 2025-12-04T11:12:31.6278352Z VMM Support: YES 2025-12-04T11:12:31.6278432Z 2025-12-04T11:12:31.6278474Z ========== 2025-12-04T11:12:31.6278582Z HSA Agents 2025-12-04T11:12:31.6278691Z ========== 2025-12-04T11:12:31.6278805Z ******* 2025-12-04T11:12:31.6278908Z Agent 1 2025-12-04T11:12:31.6279010Z ******* 2025-12-04T11:12:31.6279142Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6279299Z Uuid: CPU-XX 2025-12-04T11:12:31.6279460Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6279715Z Vendor Name: CPU 2025-12-04T11:12:31.6279875Z Feature: None specified 2025-12-04T11:12:31.6280036Z Profile: FULL_PROFILE 2025-12-04T11:12:31.6280201Z Float Round Mode: NEAR 2025-12-04T11:12:31.6280384Z Max Queue Number: 0(0x0) 2025-12-04T11:12:31.6280546Z Queue Min Size: 0(0x0) 2025-12-04T11:12:31.6280710Z Queue Max Size: 0(0x0) 2025-12-04T11:12:31.6280870Z Queue Type: MULTI 2025-12-04T11:12:31.6281032Z Node: 0 2025-12-04T11:12:31.6281190Z Device Type: CPU 2025-12-04T11:12:31.6281342Z Cache Info: 2025-12-04T11:12:31.6281472Z L1: 49152(0xc000) KB 2025-12-04T11:12:31.6281629Z Chip ID: 0(0x0) 2025-12-04T11:12:31.6281791Z ASIC Revision: 0(0x0) 2025-12-04T11:12:31.6281956Z Cacheline Size: 64(0x40) 2025-12-04T11:12:31.6282120Z Max Clock Freq. (MHz): 3300 2025-12-04T11:12:31.6282345Z BDFID: 0 2025-12-04T11:12:31.6282511Z Internal Node ID: 0 2025-12-04T11:12:31.6282792Z Compute Unit: 128 2025-12-04T11:12:31.6282955Z SIMDs per CU: 0 2025-12-04T11:12:31.6283115Z Shader Engines: 0 2025-12-04T11:12:31.6283282Z Shader Arrs. per Eng.: 0 2025-12-04T11:12:31.6283451Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:12:31.6283608Z Memory Properties: 2025-12-04T11:12:31.6283728Z Features: None 2025-12-04T11:12:31.6283847Z Pool Info: 2025-12-04T11:12:31.6284006Z Pool 1 2025-12-04T11:12:31.6284152Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6284325Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6284483Z Allocatable: TRUE 2025-12-04T11:12:31.6284650Z Alloc Granule: 4KB 2025-12-04T11:12:31.6284828Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6284999Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6285165Z Accessible by all: TRUE 2025-12-04T11:12:31.6285309Z Pool 2 2025-12-04T11:12:31.6285451Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6285615Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6285771Z Allocatable: TRUE 2025-12-04T11:12:31.6285936Z Alloc Granule: 4KB 2025-12-04T11:12:31.6286111Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6286283Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6286449Z Accessible by all: TRUE 2025-12-04T11:12:31.6286596Z Pool 3 2025-12-04T11:12:31.6286735Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:12:31.6286893Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6287049Z Allocatable: TRUE 2025-12-04T11:12:31.6287212Z Alloc Granule: 4KB 2025-12-04T11:12:31.6287421Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6287595Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6287763Z Accessible by all: TRUE 2025-12-04T11:12:31.6287914Z Pool 4 2025-12-04T11:12:31.6288054Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6288263Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6288421Z Allocatable: TRUE 2025-12-04T11:12:31.6288585Z Alloc Granule: 4KB 2025-12-04T11:12:31.6288757Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6288927Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6289097Z Accessible by all: TRUE 2025-12-04T11:12:31.6289247Z ISA Info: 2025-12-04T11:12:31.6289362Z ******* 2025-12-04T11:12:31.6289471Z Agent 2 2025-12-04T11:12:31.6289581Z ******* 2025-12-04T11:12:31.6289707Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6289909Z Uuid: CPU-XX 2025-12-04T11:12:31.6290073Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6290241Z Vendor Name: CPU 2025-12-04T11:12:31.6290407Z Feature: None specified 2025-12-04T11:12:31.6290570Z Profile: FULL_PROFILE 2025-12-04T11:12:31.6290731Z Float Round Mode: NEAR 2025-12-04T11:12:31.6290893Z Max Queue Number: 0(0x0) 2025-12-04T11:12:31.6291057Z Queue Min Size: 0(0x0) 2025-12-04T11:12:31.6291216Z Queue Max Size: 0(0x0) 2025-12-04T11:12:31.6291375Z Queue Type: MULTI 2025-12-04T11:12:31.6291525Z Node: 1 2025-12-04T11:12:31.6291682Z Device Type: CPU 2025-12-04T11:12:31.6291826Z Cache Info: 2025-12-04T11:12:31.6291949Z L1: 49152(0xc000) KB 2025-12-04T11:12:31.6292093Z Chip ID: 0(0x0) 2025-12-04T11:12:31.6292245Z ASIC Revision: 0(0x0) 2025-12-04T11:12:31.6292405Z Cacheline Size: 64(0x40) 2025-12-04T11:12:31.6292567Z Max Clock Freq. (MHz): 3300 2025-12-04T11:12:31.6292722Z BDFID: 0 2025-12-04T11:12:31.6292877Z Internal Node ID: 1 2025-12-04T11:12:31.6293037Z Compute Unit: 128 2025-12-04T11:12:31.6293195Z SIMDs per CU: 0 2025-12-04T11:12:31.6293361Z Shader Engines: 0 2025-12-04T11:12:31.6293532Z Shader Arrs. per Eng.: 0 2025-12-04T11:12:31.6293700Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:12:31.6293851Z Memory Properties: 2025-12-04T11:12:31.6293971Z Features: None 2025-12-04T11:12:31.6294088Z Pool Info: 2025-12-04T11:12:31.6294198Z Pool 1 2025-12-04T11:12:31.6294337Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6294499Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6294703Z Allocatable: TRUE 2025-12-04T11:12:31.6294870Z Alloc Granule: 4KB 2025-12-04T11:12:31.6295042Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6295213Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6295387Z Accessible by all: TRUE 2025-12-04T11:12:31.6295534Z Pool 2 2025-12-04T11:12:31.6295676Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6295834Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6295993Z Allocatable: TRUE 2025-12-04T11:12:31.6296155Z Alloc Granule: 4KB 2025-12-04T11:12:31.6296321Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6296491Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6296653Z Accessible by all: TRUE 2025-12-04T11:12:31.6296800Z Pool 3 2025-12-04T11:12:31.6296931Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:12:31.6297117Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6297271Z Allocatable: TRUE 2025-12-04T11:12:31.6297530Z Alloc Granule: 4KB 2025-12-04T11:12:31.6297696Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6297861Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6298041Z Accessible by all: TRUE 2025-12-04T11:12:31.6298214Z Pool 4 2025-12-04T11:12:31.6298353Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6298508Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6298701Z Allocatable: TRUE 2025-12-04T11:12:31.6298918Z Alloc Granule: 4KB 2025-12-04T11:12:31.6299090Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6299260Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6299425Z Accessible by all: TRUE 2025-12-04T11:12:31.6299609Z ISA Info: 2025-12-04T11:12:31.6299715Z ******* 2025-12-04T11:12:31.6299834Z Agent 3 2025-12-04T11:12:31.6299936Z ******* 2025-12-04T11:12:31.6300050Z Name: gfx942 2025-12-04T11:12:31.6300194Z Uuid: GPU-e92b40ee81585045 2025-12-04T11:12:31.6300354Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6300517Z Vendor Name: AMD 2025-12-04T11:12:31.6300671Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6300869Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6301032Z Float Round Mode: NEAR 2025-12-04T11:12:31.6301192Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6301349Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6301503Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6301668Z Queue Type: MULTI 2025-12-04T11:12:31.6301812Z Node: 2 2025-12-04T11:12:31.6301963Z Device Type: GPU 2025-12-04T11:12:31.6302100Z Cache Info: 2025-12-04T11:12:31.6302274Z L1: 32(0x20) KB 2025-12-04T11:12:31.6302409Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6302544Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6302685Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6302835Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6303006Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6303162Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6303371Z BDFID: 62720 2025-12-04T11:12:31.6303556Z Internal Node ID: 2 2025-12-04T11:12:31.6303721Z Compute Unit: 304 2025-12-04T11:12:31.6303954Z SIMDs per CU: 4 2025-12-04T11:12:31.6304116Z Shader Engines: 32 2025-12-04T11:12:31.6304278Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6304483Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6304788Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6304938Z Memory Properties: 2025-12-04T11:12:31.6305062Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6305225Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6305452Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6305657Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6305803Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6305938Z x 1024(0x400) 2025-12-04T11:12:31.6306072Z y 1024(0x400) 2025-12-04T11:12:31.6306233Z z 1024(0x400) 2025-12-04T11:12:31.6306377Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6306535Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6306718Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6306890Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6307009Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6307142Z y 65535(0xffff) 2025-12-04T11:12:31.6307283Z z 65535(0xffff) 2025-12-04T11:12:31.6307431Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6307650Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6307820Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6307984Z IOMMU Support:: None 2025-12-04T11:12:31.6308125Z Pool Info: 2025-12-04T11:12:31.6308316Z Pool 1 2025-12-04T11:12:31.6308462Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6308631Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6308794Z Allocatable: TRUE 2025-12-04T11:12:31.6308963Z Alloc Granule: 4KB 2025-12-04T11:12:31.6309139Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6309318Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6309492Z Accessible by all: FALSE 2025-12-04T11:12:31.6309642Z Pool 2 2025-12-04T11:12:31.6309785Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6310071Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6310245Z Allocatable: TRUE 2025-12-04T11:12:31.6310412Z Alloc Granule: 4KB 2025-12-04T11:12:31.6310635Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6310807Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6311194Z Accessible by all: FALSE 2025-12-04T11:12:31.6311357Z Pool 3 2025-12-04T11:12:31.6311496Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6311660Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6311819Z Allocatable: TRUE 2025-12-04T11:12:31.6312089Z Alloc Granule: 4KB 2025-12-04T11:12:31.6312273Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6312498Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6312669Z Accessible by all: FALSE 2025-12-04T11:12:31.6312818Z Pool 4 2025-12-04T11:12:31.6313029Z Segment: GROUP 2025-12-04T11:12:31.6313184Z Size: 64(0x40) KB 2025-12-04T11:12:31.6313349Z Allocatable: FALSE 2025-12-04T11:12:31.6313548Z Alloc Granule: 0KB 2025-12-04T11:12:31.6313737Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6313910Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6314079Z Accessible by all: FALSE 2025-12-04T11:12:31.6314228Z ISA Info: 2025-12-04T11:12:31.6314355Z ISA 1 2025-12-04T11:12:31.6314503Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6314685Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6314902Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6315085Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6315337Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6315548Z Fast f16: TRUE 2025-12-04T11:12:31.6315713Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6315869Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6316015Z x 1024(0x400) 2025-12-04T11:12:31.6316160Z y 1024(0x400) 2025-12-04T11:12:31.6316305Z z 1024(0x400) 2025-12-04T11:12:31.6316459Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6316612Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6316806Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6316991Z y 65535(0xffff) 2025-12-04T11:12:31.6317134Z z 65535(0xffff) 2025-12-04T11:12:31.6317292Z FBarrier Max Size: 32 2025-12-04T11:12:31.6317449Z ISA 2 2025-12-04T11:12:31.6317604Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6317792Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6317967Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6318238Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6318414Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6318580Z Fast f16: TRUE 2025-12-04T11:12:31.6318794Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6318979Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6319118Z x 1024(0x400) 2025-12-04T11:12:31.6319265Z y 1024(0x400) 2025-12-04T11:12:31.6319418Z z 1024(0x400) 2025-12-04T11:12:31.6319572Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6319723Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6319857Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6320004Z y 65535(0xffff) 2025-12-04T11:12:31.6320145Z z 65535(0xffff) 2025-12-04T11:12:31.6320301Z FBarrier Max Size: 32 2025-12-04T11:12:31.6320448Z ******* 2025-12-04T11:12:31.6320666Z Agent 4 2025-12-04T11:12:31.6320770Z ******* 2025-12-04T11:12:31.6320895Z Name: gfx942 2025-12-04T11:12:31.6321084Z Uuid: GPU-0f23c118dd1bca7f 2025-12-04T11:12:31.6321250Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6321441Z Vendor Name: AMD 2025-12-04T11:12:31.6321605Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6321768Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6321938Z Float Round Mode: NEAR 2025-12-04T11:12:31.6322116Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6322279Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6322450Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6322615Z Queue Type: MULTI 2025-12-04T11:12:31.6322799Z Node: 3 2025-12-04T11:12:31.6322953Z Device Type: GPU 2025-12-04T11:12:31.6323096Z Cache Info: 2025-12-04T11:12:31.6323230Z L1: 32(0x20) KB 2025-12-04T11:12:31.6323378Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6323520Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6323668Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6323828Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6323992Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6324158Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6324321Z BDFID: 34048 2025-12-04T11:12:31.6324480Z Internal Node ID: 3 2025-12-04T11:12:31.6324645Z Compute Unit: 304 2025-12-04T11:12:31.6324806Z SIMDs per CU: 4 2025-12-04T11:12:31.6324970Z Shader Engines: 32 2025-12-04T11:12:31.6325139Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6325310Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6325483Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6325721Z Memory Properties: 2025-12-04T11:12:31.6325848Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6326002Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6326169Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6326340Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6326497Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6326635Z x 1024(0x400) 2025-12-04T11:12:31.6326776Z y 1024(0x400) 2025-12-04T11:12:31.6326914Z z 1024(0x400) 2025-12-04T11:12:31.6327065Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6327233Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6327399Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6327554Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6327683Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6327824Z y 65535(0xffff) 2025-12-04T11:12:31.6327964Z z 65535(0xffff) 2025-12-04T11:12:31.6328240Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6328417Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6328590Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6328757Z IOMMU Support:: None 2025-12-04T11:12:31.6328904Z Pool Info: 2025-12-04T11:12:31.6329020Z Pool 1 2025-12-04T11:12:31.6329166Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6329329Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6329497Z Allocatable: TRUE 2025-12-04T11:12:31.6329666Z Alloc Granule: 4KB 2025-12-04T11:12:31.6329843Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6330024Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6330196Z Accessible by all: FALSE 2025-12-04T11:12:31.6330346Z Pool 2 2025-12-04T11:12:31.6330489Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6330651Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6330811Z Allocatable: TRUE 2025-12-04T11:12:31.6330977Z Alloc Granule: 4KB 2025-12-04T11:12:31.6331152Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6331330Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6331501Z Accessible by all: FALSE 2025-12-04T11:12:31.6331652Z Pool 3 2025-12-04T11:12:31.6331792Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6331956Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6332115Z Allocatable: TRUE 2025-12-04T11:12:31.6332282Z Alloc Granule: 4KB 2025-12-04T11:12:31.6332451Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6332625Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6332795Z Accessible by all: FALSE 2025-12-04T11:12:31.6332944Z Pool 4 2025-12-04T11:12:31.6333134Z Segment: GROUP 2025-12-04T11:12:31.6333289Z Size: 64(0x40) KB 2025-12-04T11:12:31.6333446Z Allocatable: FALSE 2025-12-04T11:12:31.6333613Z Alloc Granule: 0KB 2025-12-04T11:12:31.6333791Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6333965Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6334134Z Accessible by all: FALSE 2025-12-04T11:12:31.6334284Z ISA Info: 2025-12-04T11:12:31.6334400Z ISA 1 2025-12-04T11:12:31.6334543Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6334714Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6334883Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6335049Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6335218Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6335376Z Fast f16: TRUE 2025-12-04T11:12:31.6335578Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6335726Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6335857Z x 1024(0x400) 2025-12-04T11:12:31.6335991Z y 1024(0x400) 2025-12-04T11:12:31.6336126Z z 1024(0x400) 2025-12-04T11:12:31.6336271Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6336413Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6336536Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6336675Z y 65535(0xffff) 2025-12-04T11:12:31.6336809Z z 65535(0xffff) 2025-12-04T11:12:31.6336958Z FBarrier Max Size: 32 2025-12-04T11:12:31.6337101Z ISA 2 2025-12-04T11:12:31.6337248Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6337426Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6337593Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6337757Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6337925Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6338081Z Fast f16: TRUE 2025-12-04T11:12:31.6338278Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6338429Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6338561Z x 1024(0x400) 2025-12-04T11:12:31.6338694Z y 1024(0x400) 2025-12-04T11:12:31.6338833Z z 1024(0x400) 2025-12-04T11:12:31.6338980Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6339121Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6339248Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6339380Z y 65535(0xffff) 2025-12-04T11:12:31.6339513Z z 65535(0xffff) 2025-12-04T11:12:31.6339662Z FBarrier Max Size: 32 2025-12-04T11:12:31.6339800Z ******* 2025-12-04T11:12:31.6339903Z Agent 5 2025-12-04T11:12:31.6340048Z ******* 2025-12-04T11:12:31.6340166Z Name: gfx942 2025-12-04T11:12:31.6340311Z Uuid: GPU-1385052698a87313 2025-12-04T11:12:31.6340473Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6340638Z Vendor Name: AMD 2025-12-04T11:12:31.6340797Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6340950Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6341106Z Float Round Mode: NEAR 2025-12-04T11:12:31.6341263Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6341418Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6341570Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6341726Z Queue Type: MULTI 2025-12-04T11:12:31.6341871Z Node: 4 2025-12-04T11:12:31.6342016Z Device Type: GPU 2025-12-04T11:12:31.6342152Z Cache Info: 2025-12-04T11:12:31.6342313Z L1: 32(0x20) KB 2025-12-04T11:12:31.6342447Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6342578Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6342716Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6342864Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6343021Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6343179Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6343328Z BDFID: 58624 2025-12-04T11:12:31.6343482Z Internal Node ID: 4 2025-12-04T11:12:31.6343640Z Compute Unit: 304 2025-12-04T11:12:31.6343791Z SIMDs per CU: 4 2025-12-04T11:12:31.6343954Z Shader Engines: 32 2025-12-04T11:12:31.6344115Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6344278Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6344442Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6344586Z Memory Properties: 2025-12-04T11:12:31.6344705Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6344850Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6345010Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6345170Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6345317Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6345447Z x 1024(0x400) 2025-12-04T11:12:31.6345581Z y 1024(0x400) 2025-12-04T11:12:31.6345720Z z 1024(0x400) 2025-12-04T11:12:31.6345863Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6346023Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6346182Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6346323Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6346444Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6346576Z y 65535(0xffff) 2025-12-04T11:12:31.6346706Z z 65535(0xffff) 2025-12-04T11:12:31.6346931Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6347101Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6347265Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6347428Z IOMMU Support:: None 2025-12-04T11:12:31.6347575Z Pool Info: 2025-12-04T11:12:31.6347686Z Pool 1 2025-12-04T11:12:31.6347820Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6347973Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6348126Z Allocatable: TRUE 2025-12-04T11:12:31.6348326Z Alloc Granule: 4KB 2025-12-04T11:12:31.6348492Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6348660Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6348826Z Accessible by all: FALSE 2025-12-04T11:12:31.6348965Z Pool 2 2025-12-04T11:12:31.6349103Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6349264Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6349454Z Allocatable: TRUE 2025-12-04T11:12:31.6349614Z Alloc Granule: 4KB 2025-12-04T11:12:31.6349787Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6349958Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6350127Z Accessible by all: FALSE 2025-12-04T11:12:31.6350273Z Pool 3 2025-12-04T11:12:31.6350409Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6350572Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6350728Z Allocatable: TRUE 2025-12-04T11:12:31.6350892Z Alloc Granule: 4KB 2025-12-04T11:12:31.6351065Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6351241Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6351409Z Accessible by all: FALSE 2025-12-04T11:12:31.6351556Z Pool 4 2025-12-04T11:12:31.6351689Z Segment: GROUP 2025-12-04T11:12:31.6351839Z Size: 64(0x40) KB 2025-12-04T11:12:31.6351994Z Allocatable: FALSE 2025-12-04T11:12:31.6352156Z Alloc Granule: 0KB 2025-12-04T11:12:31.6352329Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6352499Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6352666Z Accessible by all: FALSE 2025-12-04T11:12:31.6352813Z ISA Info: 2025-12-04T11:12:31.6352934Z ISA 1 2025-12-04T11:12:31.6353076Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6353252Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6353422Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6353592Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6353765Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6353927Z Fast f16: TRUE 2025-12-04T11:12:31.6354088Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6354383Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6354520Z x 1024(0x400) 2025-12-04T11:12:31.6354659Z y 1024(0x400) 2025-12-04T11:12:31.6354797Z z 1024(0x400) 2025-12-04T11:12:31.6354947Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6355094Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6355222Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6355361Z y 65535(0xffff) 2025-12-04T11:12:31.6355497Z z 65535(0xffff) 2025-12-04T11:12:31.6355648Z FBarrier Max Size: 32 2025-12-04T11:12:31.6355791Z ISA 2 2025-12-04T11:12:31.6355948Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6356131Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6356301Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6356515Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6356687Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6356849Z Fast f16: TRUE 2025-12-04T11:12:31.6357009Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6357161Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6357297Z x 1024(0x400) 2025-12-04T11:12:31.6357435Z y 1024(0x400) 2025-12-04T11:12:31.6357572Z z 1024(0x400) 2025-12-04T11:12:31.6357727Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6357873Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6358003Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6358183Z y 65535(0xffff) 2025-12-04T11:12:31.6358328Z z 65535(0xffff) 2025-12-04T11:12:31.6358481Z FBarrier Max Size: 32 2025-12-04T11:12:31.6358625Z ******* 2025-12-04T11:12:31.6358736Z Agent 6 2025-12-04T11:12:31.6358840Z ******* 2025-12-04T11:12:31.6358962Z Name: gfx942 2025-12-04T11:12:31.6359115Z Uuid: GPU-7b47bcc6019ee30a 2025-12-04T11:12:31.6359278Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6359447Z Vendor Name: AMD 2025-12-04T11:12:31.6359607Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6359767Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6359929Z Float Round Mode: NEAR 2025-12-04T11:12:31.6360099Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6360260Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6360419Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6360578Z Queue Type: MULTI 2025-12-04T11:12:31.6360729Z Node: 5 2025-12-04T11:12:31.6360880Z Device Type: GPU 2025-12-04T11:12:31.6361021Z Cache Info: 2025-12-04T11:12:31.6361145Z L1: 32(0x20) KB 2025-12-04T11:12:31.6361330Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6361468Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6361612Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6361766Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6361925Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6362085Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6362237Z BDFID: 38144 2025-12-04T11:12:31.6362392Z Internal Node ID: 5 2025-12-04T11:12:31.6362546Z Compute Unit: 304 2025-12-04T11:12:31.6362696Z SIMDs per CU: 4 2025-12-04T11:12:31.6362849Z Shader Engines: 32 2025-12-04T11:12:31.6363016Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6363177Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6363342Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6363534Z Memory Properties: 2025-12-04T11:12:31.6363653Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6363799Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6363958Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6364121Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6364271Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6364404Z x 1024(0x400) 2025-12-04T11:12:31.6364545Z y 1024(0x400) 2025-12-04T11:12:31.6364681Z z 1024(0x400) 2025-12-04T11:12:31.6364827Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6364985Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6365147Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6365297Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6365420Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6365558Z y 65535(0xffff) 2025-12-04T11:12:31.6365693Z z 65535(0xffff) 2025-12-04T11:12:31.6365848Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6366022Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6366192Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6366357Z IOMMU Support:: None 2025-12-04T11:12:31.6366501Z Pool Info: 2025-12-04T11:12:31.6366617Z Pool 1 2025-12-04T11:12:31.6366759Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6366919Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6367083Z Allocatable: TRUE 2025-12-04T11:12:31.6367253Z Alloc Granule: 4KB 2025-12-04T11:12:31.6367428Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6367600Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6367767Z Accessible by all: FALSE 2025-12-04T11:12:31.6367913Z Pool 2 2025-12-04T11:12:31.6368053Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6368251Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6368455Z Allocatable: TRUE 2025-12-04T11:12:31.6368618Z Alloc Granule: 4KB 2025-12-04T11:12:31.6368788Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6368961Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6369130Z Accessible by all: FALSE 2025-12-04T11:12:31.6369275Z Pool 3 2025-12-04T11:12:31.6369413Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6369569Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6369726Z Allocatable: TRUE 2025-12-04T11:12:31.6369890Z Alloc Granule: 4KB 2025-12-04T11:12:31.6370064Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6370239Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6370407Z Accessible by all: FALSE 2025-12-04T11:12:31.6370552Z Pool 4 2025-12-04T11:12:31.6370686Z Segment: GROUP 2025-12-04T11:12:31.6370878Z Size: 64(0x40) KB 2025-12-04T11:12:31.6371032Z Allocatable: FALSE 2025-12-04T11:12:31.6371196Z Alloc Granule: 0KB 2025-12-04T11:12:31.6371366Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6371537Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6371707Z Accessible by all: FALSE 2025-12-04T11:12:31.6371855Z ISA Info: 2025-12-04T11:12:31.6371967Z ISA 1 2025-12-04T11:12:31.6372108Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6372282Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6372452Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6372630Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6372802Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6372965Z Fast f16: TRUE 2025-12-04T11:12:31.6373126Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6373278Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6373415Z x 1024(0x400) 2025-12-04T11:12:31.6373555Z y 1024(0x400) 2025-12-04T11:12:31.6373693Z z 1024(0x400) 2025-12-04T11:12:31.6373848Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6373996Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6374131Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6374275Z y 65535(0xffff) 2025-12-04T11:12:31.6374412Z z 65535(0xffff) 2025-12-04T11:12:31.6374566Z FBarrier Max Size: 32 2025-12-04T11:12:31.6374711Z ISA 2 2025-12-04T11:12:31.6374858Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6375040Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6375209Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6375377Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6375578Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6375742Z Fast f16: TRUE 2025-12-04T11:12:31.6375904Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6376059Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6376195Z x 1024(0x400) 2025-12-04T11:12:31.6376335Z y 1024(0x400) 2025-12-04T11:12:31.6376468Z z 1024(0x400) 2025-12-04T11:12:31.6376618Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6376764Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6376888Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6377023Z y 65535(0xffff) 2025-12-04T11:12:31.6377164Z z 65535(0xffff) 2025-12-04T11:12:31.6377320Z FBarrier Max Size: 32 2025-12-04T11:12:31.6377459Z *** Done *** 2025-12-04T11:12:31.6385285Z + rocminfo 2025-12-04T11:12:31.6387736Z + grep -E 'Name:.*\sgfx|Marketing' 2025-12-04T11:12:31.7230121Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.7230514Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.7230887Z Name: gfx942 2025-12-04T11:12:31.7231179Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7231505Z Name: gfx942 2025-12-04T11:12:31.7231859Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7232146Z Name: gfx942 2025-12-04T11:12:31.7232514Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7232853Z Name: gfx942 2025-12-04T11:12:31.7233131Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7336023Z + MAYBE_ROCM=rocm/ 2025-12-04T11:12:31.7336271Z + [[ linux-noble-rocm-py3.12-mi300 == *xpu* ]] 2025-12-04T11:12:31.7336543Z + [[ linux-noble-rocm-py3.12-mi300 != *-bazel-* ]] 2025-12-04T11:12:31.7351507Z + pip_install ninja==1.10.2 2025-12-04T11:12:31.7351760Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-12-04T11:12:31.7352053Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-12-04T11:12:31.9532767Z Collecting ninja==1.10.2 2025-12-04T11:12:31.9810204Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-12-04T11:12:31.9917550Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-12-04T11:12:32.0902101Z Installing collected packages: ninja 2025-12-04T11:12:32.0902414Z Attempting uninstall: ninja 2025-12-04T11:12:32.0913681Z Found existing installation: ninja 1.11.1.4 2025-12-04T11:12:32.0923884Z Uninstalling ninja-1.11.1.4: 2025-12-04T11:12:32.0995586Z Successfully uninstalled ninja-1.11.1.4 2025-12-04T11:12:32.1084511Z Successfully installed ninja-1.10.2 2025-12-04T11:12:32.1495501Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:32.1496393Z + 
PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:32.1496911Z + [[ linux-noble-rocm-py3.12-mi300 == *aarch64* ]] 2025-12-04T11:12:32.1497483Z + [[ linux-noble-rocm-py3.12-mi300 == *asan* ]] 2025-12-04T11:12:32.1497755Z + [[ linux-noble-rocm-py3.12-mi300 == *-debug* ]] 2025-12-04T11:12:32.1497944Z + [[ linux-noble-rocm-py3.12-mi300 != *-bazel-* ]] 2025-12-04T11:12:32.1498308Z + echo 'We are not in debug mode: linux-noble-rocm-py3.12-mi300. Expect the assertion to pass' 2025-12-04T11:12:32.1498680Z We are not in debug mode: linux-noble-rocm-py3.12-mi300. Expect the assertion to pass 2025-12-04T11:12:32.1498913Z + cd test 2025-12-04T11:12:32.1499095Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-12-04T11:12:33.1054026Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-12-04T11:12:33.1054507Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-12-04T11:12:33.1054954Z + [[ distributed == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-12-04T11:12:33.1058615Z + DYNAMO_BENCHMARK_FLAGS=() 2025-12-04T11:12:33.1059039Z + [[ distributed == *pr_time_benchmarks* ]] 2025-12-04T11:12:33.1059426Z + [[ distributed == *dynamo_eager* ]] 2025-12-04T11:12:33.1059814Z + [[ distributed == *aot_eager* ]] 2025-12-04T11:12:33.1060161Z + [[ distributed == *aot_inductor* ]] 2025-12-04T11:12:33.1060523Z + [[ distributed == *max_autotune_inductor* ]] 2025-12-04T11:12:33.1060887Z + [[ distributed == *inductor* ]] 2025-12-04T11:12:33.1061627Z + [[ distributed == *dynamic* ]] 2025-12-04T11:12:33.1061954Z + [[ distributed == *cpu* ]] 2025-12-04T11:12:33.1062255Z + [[ distributed == *xpu* ]] 2025-12-04T11:12:33.1062546Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-12-04T11:12:33.1077783Z + [[ linux-noble-rocm-py3.12-mi300 == *libtorch* ]] 2025-12-04T11:12:33.1078041Z + [[ linux-noble-rocm-py3.12-mi300 == *-bazel-* ]] 2025-12-04T11:12:33.1081392Z + cd test 2025-12-04T11:12:33.1081591Z + python -c 'import torch; print(torch.__config__.show())' 2025-12-04T11:12:33.9106865Z PyTorch built with: 2025-12-04T11:12:33.9107173Z - GCC 11.5 2025-12-04T11:12:33.9107342Z - C++ Version: 201703 2025-12-04T11:12:33.9107775Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T11:12:33.9108255Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T11:12:33.9108534Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T11:12:33.9108748Z - LAPACK is enabled (usually provided by MKL) 2025-12-04T11:12:33.9108977Z - NNPACK is enabled 2025-12-04T11:12:33.9109150Z - CPU capability usage: AVX512 2025-12-04T11:12:33.9109340Z - HIP Runtime 7.1.25424 2025-12-04T11:12:33.9109505Z - MIOpen 3.5.1 2025-12-04T11:12:33.9109648Z - Magma 2.9.0 2025-12-04T11:12:33.9112112Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_FBGEMM_GENAI -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-12-04T11:12:33.9114500Z 2025-12-04T11:12:34.1810504Z + cd test 2025-12-04T11:12:34.1810894Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-12-04T11:12:34.8973669Z ATen/Parallel: 2025-12-04T11:12:34.8974049Z at::get_num_threads() : 128 2025-12-04T11:12:34.8974895Z at::get_num_interop_threads() : 128 2025-12-04T11:12:34.8975158Z OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T11:12:34.8975408Z omp_get_max_threads() : 128 2025-12-04T11:12:34.8975855Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T11:12:34.8976311Z mkl_get_max_threads() : 128 2025-12-04T11:12:34.8976623Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T11:12:34.8976967Z std::thread::hardware_concurrency() : 256 2025-12-04T11:12:34.8977215Z Environment variables: 2025-12-04T11:12:34.8977445Z OMP_NUM_THREADS : [not set] 2025-12-04T11:12:34.8977661Z MKL_NUM_THREADS : [not set] 2025-12-04T11:12:34.8977886Z ATen parallel backend: OpenMP 2025-12-04T11:12:34.8978032Z 2025-12-04T11:12:35.1440704Z + [[ distributed == *numpy_2* ]] 2025-12-04T11:12:35.1441012Z + [[ linux-noble-rocm-py3.12-mi300 == *aarch64* ]] 2025-12-04T11:12:35.1441285Z + [[ distributed == *backward* ]] 2025-12-04T11:12:35.1441541Z + [[ distributed == *libtorch_agnostic_targetting* ]] 2025-12-04T11:12:35.1441791Z + [[ distributed == *xla* ]] 2025-12-04T11:12:35.1441992Z + [[ distributed == *vllm* ]] 2025-12-04T11:12:35.1442199Z + [[ distributed == *executorch* ]] 2025-12-04T11:12:35.1442429Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2025-12-04T11:12:35.1442944Z + [[ distributed == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-12-04T11:12:35.1443214Z + [[ linux-noble-rocm-py3.12-mi300 == *libtorch* ]] 2025-12-04T11:12:35.1443466Z + [[ distributed == distributed ]] 2025-12-04T11:12:35.1443672Z + test_distributed 2025-12-04T11:12:35.1443866Z + echo 'Testing distributed python tests' 2025-12-04T11:12:35.1444100Z Testing distributed python tests 2025-12-04T11:12:35.1444383Z + python test/run_test.py --distributed-tests --shard 2 3 --verbose 2025-12-04T11:12:36.9250016Z Excluding distributed/rpc/test_faulty_agent on ROCm 2025-12-04T11:12:36.9250458Z Excluding distributed/rpc/test_tensorpipe_agent on ROCm 2025-12-04T11:12:36.9250844Z Excluding distributed/rpc/test_share_memory on ROCm 2025-12-04T11:12:36.9251199Z Excluding distributed/rpc/cuda/test_tensorpipe_agent on ROCm 2025-12-04T11:12:37.8489565Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-12-04T11:12:38.1845385Z Ignoring disabled issues: [''] 2025-12-04T11:12:38.1894510Z Found test times from artifacts 2025-12-04T11:12:38.2067316Z Found test times from artifacts 2025-12-04T11:12:38.2072699Z Running all tests 2025-12-04T11:12:38.2118983Z Running parallel tests on 1 processes 2025-12-04T11:12:38.2119615Z Name: tests to run (est. 
time: 120.12min) 2025-12-04T11:12:38.2119950Z Serial tests (74): 2025-12-04T11:12:38.2120223Z distributed/test_dynamo_distributed 1/1 2025-12-04T11:12:38.2120535Z distributed/pipelining/test_backward 1/1 2025-12-04T11:12:38.2120827Z distributed/tensor/test_dtensor 1/1 2025-12-04T11:12:38.2121112Z distributed/tensor/test_redistribute 2/2 2025-12-04T11:12:38.2121433Z distributed/tensor/test_xla_integration 1/1 2025-12-04T11:12:38.2121762Z distributed/checkpoint/_experimental/test_types 1/1 2025-12-04T11:12:38.2122145Z distributed/tensor/experimental/test_register_sharding 1/1 2025-12-04T11:12:38.2122495Z distributed/tensor/test_tensor_ops 1/1 2025-12-04T11:12:38.2122809Z distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 2025-12-04T11:12:38.2123144Z distributed/tensor/debug/test_comm_mode_features 1/1 2025-12-04T11:12:38.2123463Z distributed/tensor/test_dtensor_ops 1/1 2025-12-04T11:12:38.2123740Z distributed/tensor/test_init 1/1 2025-12-04T11:12:38.2124022Z distributed/_composable/test_checkpoint 1/1 2025-12-04T11:12:38.2124321Z distributed/_tools/test_fsdp2_mem_tracker 1/1 2025-12-04T11:12:38.2124634Z distributed/checkpoint/e2e/test_fine_tuning 1/1 2025-12-04T11:12:38.2124941Z distributed/tensor/test_matrix_ops 1/1 2025-12-04T11:12:38.2125227Z distributed/pipelining/test_stage 1/1 2025-12-04T11:12:38.2126155Z distributed/tensor/parallel/test_tp_random_state 1/1 2025-12-04T11:12:38.2126475Z distributed/checkpoint/test_planner 1/1 2025-12-04T11:12:38.2126784Z distributed/checkpoint/test_dtensor_checkpoint 1/1 2025-12-04T11:12:38.2127098Z distributed/pipelining/test_schedule 1/1 2025-12-04T11:12:38.2127438Z distributed/_composable/fsdp/test_fully_shard_overlap 1/1 2025-12-04T11:12:38.2127761Z distributed/test_run 1/1 2025-12-04T11:12:38.2128014Z distributed/tensor/test_math_ops 1/1 2025-12-04T11:12:38.2128402Z distributed/test_functional_api 1/1 2025-12-04T11:12:38.2128733Z distributed/_composable/fsdp/test_fully_shard_compile 1/1 2025-12-04T11:12:38.2129068Z distributed/_composable/test_replicate 1/1 2025-12-04T11:12:38.2129363Z distributed/checkpoint/test_pg_transport 1/1 2025-12-04T11:12:38.2129723Z distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 2025-12-04T11:12:38.2130069Z distributed/checkpoint/test_utils 1/1 2025-12-04T11:12:38.2130421Z distributed/checkpoint/_experimental/test_checkpoint_process 1/1 2025-12-04T11:12:38.2130752Z distributed/test_c10d_logger 1/1 2025-12-04T11:12:38.2130975Z distributed/_composable/test_replicate_training 1/1 2025-12-04T11:12:38.2131230Z distributed/optim/test_apply_optimizer_in_backward 1/1 2025-12-04T11:12:38.2131623Z distributed/fsdp/test_fsdp_uneven 1/1 2025-12-04T11:12:38.2131843Z distributed/tensor/test_op_strategy 1/1 2025-12-04T11:12:38.2132041Z distributed/fsdp/test_fsdp_grad_acc 1/1 2025-12-04T11:12:38.2132261Z distributed/checkpoint/test_state_dict_stager 1/1 2025-12-04T11:12:38.2132497Z distributed/fsdp/test_fsdp_freezing_weights 1/1 2025-12-04T11:12:38.2132748Z distributed/_composable/fsdp/test_fully_shard_init 1/1 2025-12-04T11:12:38.2132992Z distributed/fsdp/test_fsdp_exec_order 1/1 2025-12-04T11:12:38.2133206Z distributed/fsdp/test_fsdp_flatten_params 1/1 2025-12-04T11:12:38.2133417Z distributed/test_distributed_spawn 3/7 2025-12-04T11:12:38.2133627Z distributed/test_distributed_spawn 6/7 2025-12-04T11:12:38.2133830Z distributed/fsdp/test_fsdp_traversal 1/1 2025-12-04T11:12:38.2134037Z distributed/test_serialization 1/1 2025-12-04T11:12:38.2134267Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 2025-12-04T11:12:38.2134502Z 
distributed/fsdp/test_fsdp_ignored_modules 1/1 2025-12-04T11:12:38.2134735Z distributed/fsdp/test_checkpoint_wrapper 1/1 2025-12-04T11:12:38.2134949Z distributed/fsdp/test_fsdp_checkpoint 1/1 2025-12-04T11:12:38.2135155Z distributed/fsdp/test_fsdp_fine_tune 1/1 2025-12-04T11:12:38.2135377Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 2025-12-04T11:12:38.2135608Z distributed/fsdp/test_fsdp_comm_hooks 1/1 2025-12-04T11:12:38.2135818Z distributed/fsdp/test_fsdp_hybrid_shard 1/1 2025-12-04T11:12:38.2136024Z distributed/_shard/test_sharder 1/1 2025-12-04T11:12:38.2136263Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 2025-12-04T11:12:38.2136532Z distributed/_shard/sharding_plan/test_sharding_plan 1/1 2025-12-04T11:12:38.2136767Z distributed/fsdp/test_fsdp_comm 1/1 2025-12-04T11:12:38.2136966Z distributed/test_c10d_pypg 1/1 2025-12-04T11:12:38.2137155Z distributed/test_pg_wrapper 1/1 2025-12-04T11:12:38.2137351Z distributed/tensor/test_utils 1/1 2025-12-04T11:12:38.2137561Z distributed/fsdp/test_fsdp_unshard_params 1/1 2025-12-04T11:12:38.2137801Z distributed/checkpoint/test_state_dict_utils 1/1 2025-12-04T11:12:38.2138037Z distributed/_shard/sharded_tensor/ops/test_init 1/1 2025-12-04T11:12:38.2138335Z distributed/_shard/sharded_tensor/ops/test_embedding 1/1 2025-12-04T11:12:38.2138613Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 2025-12-04T11:12:38.2138904Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 2025-12-04T11:12:38.2139155Z distributed/fsdp/test_fsdp_core 1/3 2025-12-04T11:12:38.2139353Z distributed/test_c10d_spawn_gloo 1/1 2025-12-04T11:12:38.2139549Z distributed/test_c10d_spawn_ucc 1/1 2025-12-04T11:12:38.2139801Z distributed/test_c10d_gloo 1/2 2025-12-04T11:12:38.2140006Z distributed/fsdp/test_fsdp_mixed_precision 1/1 2025-12-04T11:12:38.2140215Z distributed/test_c10d_nccl 2/3 2025-12-04T11:12:38.2140402Z distributed/elastic/timer/api_test 1/1 2025-12-04T11:12:38.2140591Z Parallel tests (0): 2025-12-04T11:12:38.2140780Z Name: excluded (est. time: 0.0min) 2025-12-04T11:12:38.2140914Z Serial tests (0): 2025-12-04T11:12:38.2141035Z Parallel tests (0): 2025-12-04T11:12:38.2141251Z Running distributed/test_dynamo_distributed 1/1 ... [2025-12-04 11:12:38.212203][2286256.861384774] 2025-12-04T11:12:38.2141502Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:12:38.2142004Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_dynamo_distributed.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:12:38.212405] 2025-12-04T11:20:29.1645267Z 2025-12-04T11:20:29.1646314Z distributed/test_dynamo_distributed 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_dynamo_distributed_1.1_667e69f56c0d2ea5_.log 2025-12-04T11:20:29.1660367Z Running 62 items in this shard: test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_call_method_forward, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_ddp_optimizer_inductor_strides_dont_specialize, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_hf_bert_ddp_aot_eager, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_hf_bert_ddp_inductor, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_issue90375, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_symbol_splitting, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_direct, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_indirect, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_no_binding, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_torture_multi, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_asymmetric_compilation, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_asymmetric_compilation_with_fx_cache, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_scalar, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_speculation_divergence, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_tensor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_dim_mismatch, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_graph_break_empty_graph_still_collective, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_missing_source, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_scalar_missing_source, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_type_mismatch, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_baseline_aot_eager_multiprocess, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_optimizer_cudagraph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_aot_eager, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_inductor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_setattr, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_unspecialized_forced_getattr_inline, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_unspecialized_forced_getattr_no_inline, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_get_pg_attr, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_guard_collective, 
test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_aot_eager, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_aot_eager_static_graph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_inductor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_inductor_static_graph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_fsdp, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_fsdp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_multiproc_autotune, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_multiproc_autotune_dynamic_shapes, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_aot_autograd, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_async_subclass_no_specialize, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_compiled_flex_attention_full_model_ddp, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_compiled_flex_attention_local_ddp, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_custom_layer, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ddp_baseline_aot_eager, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ddp_baseline_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_empty_graph_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_dup_tensors_diff_source, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_dup_tensors_same_source, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_orig_params_assert, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_skip_guards, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_skip_register_attr_or_module, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_staticmethod, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_ctx_manager, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_layout_optimizations_inference, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_layout_optimizations_training, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_transpose, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_higher_order_op, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ignored_parameters, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_no_split 2025-12-04T11:20:29.1670466Z 2025-12-04T11:20:29.1670610Z Finished distributed/test_dynamo_distributed 1/1 ... 
[2025-12-04 11:20:29.164543][2286727.813723325], took 7.85min 2025-12-04T11:20:29.1671086Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:20:31.2085956Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:20:31.2086574Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:20:31.2086976Z Uploading artifacts took 0.00 seconds 2025-12-04T11:20:31.2087433Z Running distributed/pipelining/test_backward 1/1 ... [2025-12-04 11:20:31.208236][2286729.857415227] 2025-12-04T11:20:31.2088760Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:20:31.2089664Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/pipelining/test_backward.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:20:31.208496] 2025-12-04T11:20:37.5320812Z 2025-12-04T11:20:37.5322034Z distributed/pipelining/test_backward 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_backward_1.1_bb427c1284ca5bca_.log 2025-12-04T11:20:37.5324250Z Running 5 items in this shard: test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_input_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_grad_validation_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_multiple_iters_cuda 2025-12-04T11:20:37.5325788Z 2025-12-04T11:20:37.5326047Z Finished distributed/pipelining/test_backward 1/1 ... [2025-12-04 11:20:37.531714][2286736.180893869], took 0.11min 2025-12-04T11:20:37.5327447Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:20:37.5345203Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:20:37.5346339Z Running distributed/tensor/test_dtensor 1/1 ... [2025-12-04 11:20:37.534498][2286736.18368196] 2025-12-04T11:20:37.5346628Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:20:37.5348056Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_dtensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:20:37.534676] 2025-12-04T11:23:34.3164353Z 2025-12-04T11:23:34.3168654Z distributed/tensor/test_dtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_1.1_23eb169f26f938b6_.log 2025-12-04T11:23:34.3185342Z Running 86 items in this shard: test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_async_output, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_constructor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_new_empty_strided, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_properties, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load_import, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_spec_hash, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_spec_read_only_after_set, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_stride, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_negative_dim, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_then_to_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_uneven_sharding, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_uneven_sharding_raise_error, test/distributed/tensor/test_dtensor.py::DTensorTest::test_full_tensor_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTest::test_full_tensor_sync, test/distributed/tensor/test_dtensor.py::DTensorTest::test_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_modules_w_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_shard_tensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_shard_tensor_2d, test/distributed/tensor/test_dtensor.py::DTensorTest::test_to_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_to_local_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_async_output, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_constructor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_new_empty_strided, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_properties, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_save_load, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_save_load_import, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_spec_hash, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_spec_read_only_after_set, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_stride, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_negative_dim, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_then_to_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_uneven_sharding, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_uneven_sharding_raise_error, 
test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_full_tensor_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_full_tensor_sync, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_modules_w_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_shard_tensor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_shard_tensor_2d, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_to_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_to_local_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_as_strided_identity, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_auto_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_default_value_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_device_mesh_nd, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_2d_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_api_device_mesh_context_manager, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_cond, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_device_mesh_device_conversion, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_spec_local_shard_offset, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_from_local_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_inplace_on_local_tensor_view, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_metadata_consistency_check, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_redistribute_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_vmap_embedding, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_as_strided_identity, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_auto_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_default_value_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_device_mesh_nd, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_2d_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_api_device_mesh_context_manager, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_cond, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_device_mesh_device_conversion, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_spec_local_shard_offset, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_from_local_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_inplace_on_local_tensor_view, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_metadata_consistency_check, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_redistribute_sub_mesh, 
test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_vmap_embedding, test/distributed/tensor/test_dtensor.py::TestDTensorPlacementTypes::test_split_tensor_1D, test/distributed/tensor/test_dtensor.py::TestDTensorPlacementTypesWithLocalTensor::test_split_tensor_1D, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_default_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_default_shard_order_generation, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_print, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_update, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_with_invalid_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_default_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_default_shard_order_generation, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_print, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_update, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_with_invalid_shard_order 2025-12-04T11:23:34.3197435Z 2025-12-04T11:23:34.3197563Z Finished distributed/tensor/test_dtensor 1/1 ... [2025-12-04 11:23:34.316616][2286912.9657978], took 2.95min 2025-12-04T11:23:34.3198017Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:23:34.3198466Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:23:34.3198715Z Running distributed/tensor/test_redistribute 2/2 ... [2025-12-04 11:23:34.318729][2286912.967913642] 2025-12-04T11:23:34.3198928Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:23:34.3199345Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_redistribute.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:23:34.318904] 2025-12-04T11:24:38.1808962Z 2025-12-04T11:24:38.1810065Z distributed/tensor/test_redistribute 2/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_redistribute_2.2_f8b988b9ca5f7ec2_.log 2025-12-04T11:24:38.1821074Z Running 33 items in this shard: test/distributed/tensor/test_redistribute.py::RedistributeTest::test_one_chunk_mesh, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_partial_to_replicate_forward_backward_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_partial_to_shard_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_local_partial_grad_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_local_partial_grad_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_shard_forward_backward, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_dim_alltoall_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_dim_alltoall_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_to_replicate_forward_backward_complex64, test/distributed/tensor/test_redistribute.py::MultiDimRedistributeTest::test_redistribute_shard_dim_multi_dim_mesh, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_generate_shard_orders, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_ordered_distribute_all_combination, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_ordered_redistribute_with_partial, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_shard_order_same_data_as_strided_shard, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_one_chunk_mesh, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_replicate_forward_backward_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_replicate_forward_backward_float32, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_shard_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_negative_shard_dim, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_to_partial, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_uneven_sharding, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_partial, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_replicate_forward_backward, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_dim_alltoall_float32, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_to_replicate_forward_backward_float32, 
test/distributed/tensor/test_redistribute.py::MultiDimRedistributeTestWithLocalTensor::test_multi_dim_mesh, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_generate_shard_orders, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute_for_special_placement, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute_with_partial 2025-12-04T11:24:38.1831224Z 2025-12-04T11:24:38.1831460Z Finished distributed/tensor/test_redistribute 2/2 ... [2025-12-04 11:24:38.180603][2286976.829783644], took 1.06min 2025-12-04T11:24:38.1832188Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:38.1832838Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:38.1834336Z Running distributed/tensor/test_xla_integration 1/1 ... [2025-12-04 11:24:38.183325][2286976.832509256] 2025-12-04T11:24:38.1834672Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:38.1836431Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_xla_integration.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:24:38.183496] 2025-12-04T11:24:40.3514762Z 2025-12-04T11:24:40.3515866Z distributed/tensor/test_xla_integration 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_xla_integration_1.1_4e7c95da93c4644a_.log 2025-12-04T11:24:40.3517577Z Running 3 items in this shard: test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_1d_replicate, test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_1d_shard, test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_2d 2025-12-04T11:24:40.3518959Z 2025-12-04T11:24:40.3519247Z Finished distributed/tensor/test_xla_integration 1/1 ... [2025-12-04 11:24:40.351098][2286979.000277074], took 0.04min 2025-12-04T11:24:40.3520167Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:40.3539370Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:40.3542059Z Running distributed/checkpoint/_experimental/test_types 1/1 ... [2025-12-04 11:24:40.354068][2286979.003251882] 2025-12-04T11:24:40.3542407Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:40.3543760Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_types.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:24:40.354240] 2025-12-04T11:24:42.5722294Z 2025-12-04T11:24:42.5723686Z distributed/checkpoint/_experimental/test_types 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_types_1.1_5a37802355b2ddd8_.log 2025-12-04T11:24:42.5725475Z Running 3 items in this shard: test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_rank_info_default_initialization, test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_rank_info_initialization, test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_state_dict_type_alias 2025-12-04T11:24:42.5726710Z 2025-12-04T11:24:42.5727027Z Finished distributed/checkpoint/_experimental/test_types 1/1 ... [2025-12-04 11:24:42.571896][2286981.22107634], took 0.04min 2025-12-04T11:24:42.5727984Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:42.5747207Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:42.5749737Z Running distributed/tensor/experimental/test_register_sharding 1/1 ... [2025-12-04 11:24:42.574846][2286981.224030038] 2025-12-04T11:24:42.5750124Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:42.5751661Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/experimental/test_register_sharding.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:24:42.575031] 2025-12-04T11:24:58.3624288Z 2025-12-04T11:24:58.3625908Z distributed/tensor/experimental/test_register_sharding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.experimental.test_register_sharding_1.1_f0eac74a87d7a376_.log 2025-12-04T11:24:58.3628502Z Running 3 items in this shard: test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_argmax, test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_register_sharding_for_tensor_kwargs, test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_softmax_fwd 2025-12-04T11:24:58.3629082Z 2025-12-04T11:24:58.3629250Z Finished distributed/tensor/experimental/test_register_sharding 1/1 ... [2025-12-04 11:24:58.362059][2286997.011239099], took 0.26min 2025-12-04T11:24:58.3629762Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:58.3647822Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:58.3650373Z Running distributed/tensor/test_tensor_ops 1/1 ... [2025-12-04 11:24:58.364903][2286997.014086689] 2025-12-04T11:24:58.3650834Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:58.3652111Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_tensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:24:58.365079] 2025-12-04T11:27:25.6719710Z 2025-12-04T11:27:25.6720724Z distributed/tensor/test_tensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_tensor_ops_1.1_f0e0b15364e85b24_.log 2025-12-04T11:27:25.6734062Z Running 62 items in this shard: test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_aten_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_clone, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_copy_, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_detach, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_dtensor_dtype_conversion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_empty_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_equal, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_fill_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_fill_inplace_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_full_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_gather, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index_put_scalar, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index_put_tensor, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_inplace_op, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_new_empty_strided, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_new_full, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_ones_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_ones_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_op_out_variant, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_scatter, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_slice, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_split_on_partial, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_stack, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_stack_cache, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_unbind, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_where_type_promotion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zero_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zeros_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zeros_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_aten_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_clone, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_copy_, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_detach, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_dtensor_dtype_conversion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_empty_like, 
test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_equal, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_fill_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_fill_inplace_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_full_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_gather, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index_put_scalar, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index_put_tensor, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_inplace_op, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_new_empty_strided, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_new_full, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_ones_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_ones_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_op_out_variant, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_scatter, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_slice, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_split_on_partial, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_stack, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_stack_cache, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_unbind, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_where_type_promotion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zero_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zeros_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zeros_like_partial_sum 2025-12-04T11:27:25.6743259Z 2025-12-04T11:27:25.6743416Z Finished distributed/tensor/test_tensor_ops 1/1 ... [2025-12-04 11:27:25.671647][2287144.320827578], took 2.46min 2025-12-04T11:27:25.6743928Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:27:25.6744351Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:27:25.6744610Z Running distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 ... [2025-12-04 11:27:25.674302][2287144.323486351] 2025-12-04T11:27:25.6744823Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:27:25.6745843Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/fsdp/test_fsdp_dsd.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:27:25.674471] 2025-12-04T11:28:13.4636186Z 2025-12-04T11:28:13.4638546Z distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.fsdp.test_fsdp_dsd_1.1_5ae14876d5b52090_.log 2025-12-04T11:28:13.4642503Z Running 6 items in this shard: test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_1d_fsdp_cpu_offload_full_model_state_dict, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_1d_fsdp_get_model_state_dict, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp1_and_load_with_fsdp2, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp1_and_load_with_fsdp2_tp, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp2_tp_and_load_with_tp, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_tp_and_load_with_fsdp2_tp 2025-12-04T11:28:13.4645249Z 2025-12-04T11:28:13.4645588Z Finished distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 ... [2025-12-04 11:28:13.463371][2287192.112550391], took 0.80min 2025-12-04T11:28:13.4646610Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:28:13.4662134Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:13.4664651Z Running distributed/tensor/debug/test_comm_mode_features 1/1 ... [2025-12-04 11:28:13.466352][2287192.115535908] 2025-12-04T11:28:13.4666241Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:13.4666872Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/debug/test_comm_mode_features.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:13.466536] 2025-12-04T11:28:45.0817415Z 2025-12-04T11:28:45.0818508Z distributed/tensor/debug/test_comm_mode_features 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.debug.test_comm_mode_features_1.1_cc58908746ac96e0_.log 2025-12-04T11:28:45.0822597Z Running 4 items in this shard: test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLPStacked_distributed_sharding_display, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLP_distributed_sharding_display, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLP_module_tracing, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_transformer_module_tracing 2025-12-04T11:28:45.0823559Z 2025-12-04T11:28:45.0823740Z Finished distributed/tensor/debug/test_comm_mode_features 1/1 ... [2025-12-04 11:28:45.081381][2287223.73056136], took 0.53min 2025-12-04T11:28:45.0824266Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:28:45.0841592Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:45.0844317Z Running distributed/tensor/test_dtensor_ops 1/1 ... 
[2025-12-04 11:28:45.084305][2287223.733489309] 2025-12-04T11:28:45.0844750Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:45.0846032Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_dtensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:45.084474] 2025-12-04T11:28:48.2508068Z 2025-12-04T11:28:48.2509253Z distributed/tensor/test_dtensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_ops_1.1_e7e03ffb1fd8c0ba_.log 2025-12-04T11:28:48.2509792Z Running 0 items in this shard: 2025-12-04T11:28:48.2510354Z 2025-12-04T11:28:48.2510570Z Finished distributed/tensor/test_dtensor_ops 1/1 ... [2025-12-04 11:28:48.250524][2287226.899704378], took 0.05min 2025-12-04T11:28:48.2512437Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:28:48.2529756Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:48.2532570Z Running distributed/tensor/test_init 1/1 ... [2025-12-04 11:28:48.253172][2287226.902355781] 2025-12-04T11:28:48.2532836Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:48.2534718Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:48.253342] 2025-12-04T11:29:22.4704135Z 2025-12-04T11:29:22.4707798Z distributed/tensor/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_init_1.1_302246374c6efe1a_.log 2025-12-04T11:29:22.4710000Z Running 13 items in this shard: test/distributed/tensor/test_init.py::DTensorInitOpsTest::test_init_ops, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_empty, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_full, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_ones, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros_full_mesh, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros_submesh, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_empty, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_full, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_ones, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros_full_mesh, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros_submesh 2025-12-04T11:29:22.4712239Z 2025-12-04T11:29:22.4712364Z Finished distributed/tensor/test_init 1/1 ... 
[2025-12-04 11:29:22.470023][2287261.11920278], took 0.57min 2025-12-04T11:29:22.4714496Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:29:22.4725555Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:29:22.4728011Z Running distributed/_composable/test_checkpoint 1/1 ... [2025-12-04 11:29:22.472703][2287261.121886643] 2025-12-04T11:29:22.4728286Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:29:22.4730149Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/test_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:29:22.472887] 2025-12-04T11:29:28.0456796Z 2025-12-04T11:29:28.0458005Z distributed/_composable/test_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_checkpoint_1.1_1193b4dea4e22f77_.log 2025-12-04T11:29:28.0460984Z Running 6 items in this shard: test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_checkpoint_kwargs, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_clears_state_on_error_in_forward, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_multi_args, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_random_cpu, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_tensor_only_cpu, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_tensor_only_gpu 2025-12-04T11:29:28.0462458Z 2025-12-04T11:29:28.0462724Z Finished distributed/_composable/test_checkpoint 1/1 ... [2025-12-04 11:29:28.045318][2287266.694497399], took 0.09min 2025-12-04T11:29:28.0463570Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:29:28.0480338Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:29:28.0482396Z Running distributed/_tools/test_fsdp2_mem_tracker 1/1 ... [2025-12-04 11:29:28.048152][2287266.697336159] 2025-12-04T11:29:28.0482706Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:29:28.0484287Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_tools/test_fsdp2_mem_tracker.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:29:28.048321] 2025-12-04T11:30:00.9584015Z 2025-12-04T11:30:00.9584872Z distributed/_tools/test_fsdp2_mem_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_fsdp2_mem_tracker_1.1_ff74fe95d0881805_.log 2025-12-04T11:30:00.9587199Z Running 3 items in this shard: test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCore::test_tracker_multi_group_eager, test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCore::test_tracker_non_root_forward_backward, test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCompose::test_tracker_with_activation_checkpointing 2025-12-04T11:30:00.9588586Z 2025-12-04T11:30:00.9588906Z Finished distributed/_tools/test_fsdp2_mem_tracker 1/1 ... [2025-12-04 11:30:00.958028][2287299.607208383], took 0.55min 2025-12-04T11:30:00.9590389Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:30:00.9611599Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:30:00.9612643Z Running distributed/checkpoint/e2e/test_fine_tuning 1/1 ... [2025-12-04 11:30:00.961074][2287299.610258469] 2025-12-04T11:30:00.9612980Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:30:00.9613830Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/e2e/test_fine_tuning.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:00.961248] 2025-12-04T11:30:20.4067577Z 2025-12-04T11:30:20.4069288Z distributed/checkpoint/e2e/test_fine_tuning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.e2e.test_fine_tuning_1.1_f4af570b33c9e31a_.log 2025-12-04T11:30:20.4070719Z Running 1 items in this shard: test/distributed/checkpoint/e2e/test_fine_tuning.py::TestFineTuning::test_fine_tuning 2025-12-04T11:30:20.4071246Z 2025-12-04T11:30:20.4071657Z Finished distributed/checkpoint/e2e/test_fine_tuning 1/1 ... [2025-12-04 11:30:20.406549][2287319.055729023], took 0.32min 2025-12-04T11:30:20.4074908Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:30:20.4093365Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:30:20.4097218Z Running distributed/tensor/test_matrix_ops 1/1 ... [2025-12-04 11:30:20.409444][2287319.058628583] 2025-12-04T11:30:20.4097558Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:30:20.4098892Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_matrix_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:30:20.409619] 2025-12-04T11:31:58.3401623Z 2025-12-04T11:31:58.3402735Z distributed/tensor/test_matrix_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_matrix_ops_1.1_8a1aea47df570e83_.log 2025-12-04T11:31:58.3412338Z Running 30 items in this shard: test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_auto_redistribute, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_empty_operand, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_baddbmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_bmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_dtensor_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_grouped_mm_kwargs0, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_grouped_mm_kwargs1, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_matmul, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_scaled_dot_product_attention, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_scaled_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t_partial, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_tensordot_shampoo, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm_auto_redistribute, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm_empty_operand, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_baddbmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_bmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_dtensor_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_grouped_mm_kwargs0, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_grouped_mm_kwargs1, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_matmul, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_scaled_dot_product_attention, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_scaled_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_t, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_t_partial, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_tensordot_shampoo 2025-12-04T11:31:58.3420406Z 2025-12-04T11:31:58.3420598Z Finished distributed/tensor/test_matrix_ops 1/1 ... 
[2025-12-04 11:31:58.339776][2287416.988955446], took 1.63min 2025-12-04T11:31:58.3421206Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:31:58.3424880Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:31:58.3427346Z Running distributed/pipelining/test_stage 1/1 ... [2025-12-04 11:31:58.342625][2287416.991808566] 2025-12-04T11:31:58.3427570Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:31:58.3429358Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/pipelining/test_stage.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:31:58.342801] 2025-12-04T11:32:25.0473401Z 2025-12-04T11:32:25.0473905Z distributed/pipelining/test_stage 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_stage_1.1_f07eb832c6792751_.log 2025-12-04T11:32:25.0476002Z Running 8 items in this shard: test/distributed/pipelining/test_stage.py::StageTest::test_custom_dw_with_fb_schedule, test/distributed/pipelining/test_stage.py::StageTest::test_manual, test/distributed/pipelining/test_stage.py::StageTest::test_output_chunks_memory_usage, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_ModelClass0, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_ModelClass1, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_kwargs_ModelClass0, test/distributed/pipelining/test_stage.py::StageNegativeTest::test_custom_dw_errors, test/distributed/pipelining/test_stage.py::StageNegativeTest::test_shape_prop_mismatch 2025-12-04T11:32:25.0477598Z 2025-12-04T11:32:25.0477818Z Finished distributed/pipelining/test_stage 1/1 ... [2025-12-04 11:32:25.046954][2287443.696134427], took 0.45min 2025-12-04T11:32:25.0478754Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:25.0495172Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:25.0497218Z Running distributed/tensor/parallel/test_tp_random_state 1/1 ... [2025-12-04 11:32:25.049630][2287443.69881402] 2025-12-04T11:32:25.0497525Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:25.0499405Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/parallel/test_tp_random_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:25.049798] 2025-12-04T11:32:33.3270671Z 2025-12-04T11:32:33.3271695Z distributed/tensor/parallel/test_tp_random_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_tp_random_state_1.1_bdd7d70d1ebe3f35_.log 2025-12-04T11:32:33.3275696Z Running 1 items in this shard: test/distributed/tensor/parallel/test_tp_random_state.py::TensorParallelRandomStateTests::test_model_init 2025-12-04T11:32:33.3276361Z 2025-12-04T11:32:33.3276812Z Finished distributed/tensor/parallel/test_tp_random_state 1/1 ... 
[2025-12-04 11:32:33.326692][2287451.975871114], took 0.14min 2025-12-04T11:32:33.3279402Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:33.3297694Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:33.3301535Z Running distributed/checkpoint/test_planner 1/1 ... [2025-12-04 11:32:33.329859][2287451.979043479] 2025-12-04T11:32:33.3301821Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:33.3302643Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_planner.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:33.330028] 2025-12-04T11:32:35.5478861Z 2025-12-04T11:32:35.5479406Z distributed/checkpoint/test_planner 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_planner_1.1_844b415c886f474f_.log 2025-12-04T11:32:35.5484092Z Running 17 items in this shard: test/distributed/checkpoint/test_planner.py::TestSavePlan::test_dedup_plans, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_finish_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_global_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_global_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_load_with_resharding, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_load_with_world_size_diff_by_one, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_load_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_compare_save_plans, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_create_read_item_from_chunks, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_merge_delta_local_plans, test/distributed/checkpoint/test_planner.py::TestValidateGlobalPlan::test_detect_overlapping_chunks, test/distributed/checkpoint/test_planner.py::TestValidateGlobalPlan::test_non_overlapping_chunks, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_load_different_sizes_throws, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_strict, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_version_key_in_planner_data 2025-12-04T11:32:35.5487866Z 2025-12-04T11:32:35.5488270Z Finished distributed/checkpoint/test_planner 1/1 ... [2025-12-04 11:32:35.547513][2287454.196693529], took 0.04min 2025-12-04T11:32:35.5489022Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:35.5503817Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:35.5506667Z Running distributed/checkpoint/test_dtensor_checkpoint 1/1 ... 
[2025-12-04 11:32:35.550528][2287454.199711956] 2025-12-04T11:32:35.5506955Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:35.5508452Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_dtensor_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:35.550704] 2025-12-04T11:32:42.8758703Z 2025-12-04T11:32:42.8759850Z distributed/checkpoint/test_dtensor_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_dtensor_checkpoint_1.1_e24346b9f1951dfb_.log 2025-12-04T11:32:42.8760944Z Running 1 items in this shard: test/distributed/checkpoint/test_dtensor_checkpoint.py::DTensorPlanner::test_distributed_tensor_planner 2025-12-04T11:32:42.8761392Z 2025-12-04T11:32:42.8761708Z Finished distributed/checkpoint/test_dtensor_checkpoint 1/1 ... [2025-12-04 11:32:42.875515][2287461.524695434], took 0.12min 2025-12-04T11:32:42.8766176Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:42.8784366Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:42.8786895Z Running distributed/pipelining/test_schedule 1/1 ... [2025-12-04 11:32:42.878555][2287461.52773934] 2025-12-04T11:32:42.8787239Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:42.8788591Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/pipelining/test_schedule.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:32:42.878731] 2025-12-04T11:33:08.3326422Z 2025-12-04T11:33:08.3327089Z distributed/pipelining/test_schedule 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_schedule_1.1_ce7bd12d8f7e2c87_.log 2025-12-04T11:33:08.3334834Z Running 43 items in this shard: test/distributed/pipelining/test_schedule.py::ScheduleTest::test_get_schedule_class, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass2, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass3, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass4, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass2, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass3, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass4, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass2, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_flex_and_zero_bubble_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_flex_and_zero_bubble_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_for_v_schedules_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_for_v_schedules_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestScheduleCsv::test_csv_compare_ScheduleClass0_csv_name_dualpipev_4rank_10mb, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref2, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref3, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref4, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref5, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref6, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref7, 
test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_csv_csv_name_zb1p_2rank_2stagep, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_grad_with_split_b_w, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_grad_with_v_schedule, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_merge_bw_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_reduce_grad_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_reduce_grad_test_info1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_send_recv_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_send_recv_test_info1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_unshard_reshard_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_unshard_reshard_test_info1, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_invalid_schedule_missing_action, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_invalid_schedule_missing_rank, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_valid_schedule, test/distributed/pipelining/test_schedule.py::ScheduleUtilTests::test_generate_stage_to_rank_mapping 2025-12-04T11:33:08.3341416Z 2025-12-04T11:33:08.3341556Z Finished distributed/pipelining/test_schedule 1/1 ... [2025-12-04 11:33:08.332246][2287486.981425943], took 0.42min 2025-12-04T11:33:08.3342000Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:33:08.3350004Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:33:08.3353029Z Running distributed/_composable/fsdp/test_fully_shard_overlap 1/1 ... [2025-12-04 11:33:08.335190][2287486.984374561] 2025-12-04T11:33:08.3353489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:33:08.3355131Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_overlap.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:08.335374] 2025-12-04T11:33:18.6153574Z 2025-12-04T11:33:18.6155010Z distributed/_composable/fsdp/test_fully_shard_overlap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_overlap_1.1_f0dbe397233484d2_.log 2025-12-04T11:33:18.6157063Z Running 2 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_overlap.py::TestFullyShardOverlap::test_fully_shard_post_optim_event_overlap, test/distributed/_composable/fsdp/test_fully_shard_overlap.py::TestFullyShardOverlap::test_fully_shard_training_overlap 2025-12-04T11:33:18.6158452Z 2025-12-04T11:33:18.6158949Z Finished distributed/_composable/fsdp/test_fully_shard_overlap 1/1 ... 
[2025-12-04 11:33:18.615018][2287497.264198708], took 0.17min 2025-12-04T11:33:18.6160288Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:33:18.6178884Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:33:18.6181216Z Running distributed/test_run 1/1 ... [2025-12-04 11:33:18.618006][2287497.267189576] 2025-12-04T11:33:18.6181554Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:33:18.6183125Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_run.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:18.618191] 2025-12-04T11:33:20.8359751Z 2025-12-04T11:33:20.8360501Z distributed/test_run 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_run_1.1_21fea8d12c472afb_.log 2025-12-04T11:33:20.8361679Z Running 4 items in this shard: test/distributed/test_run.py::RunTest::test_config_from_args_signals_to_handle, test/distributed/test_run.py::RunTest::test_launch_agent_sets_environment_variable, test/distributed/test_run.py::RunTest::test_signals_to_handle_custom, test/distributed/test_run.py::RunTest::test_signals_to_handle_default 2025-12-04T11:33:20.8362470Z 2025-12-04T11:33:20.8362655Z Finished distributed/test_run 1/1 ... [2025-12-04 11:33:20.835746][2287499.484926406], took 0.04min 2025-12-04T11:33:20.8368432Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:33:20.8386851Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:33:20.8390716Z Running distributed/tensor/test_math_ops 1/1 ... [2025-12-04 11:33:20.838830][2287499.488014322] 2025-12-04T11:33:20.8391190Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:33:20.8391999Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_math_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:33:20.839012] 2025-12-04T11:35:49.3256312Z 2025-12-04T11:35:49.3257121Z distributed/tensor/test_math_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_math_ops_1.1_85a1a8506d37fc70_.log 2025-12-04T11:35:49.3269374Z Running 54 items in this shard: test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_conj_complex_dtensor, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_cumsum, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_add_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_histc, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_bwd_req_grad, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_linalg_eigh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_linear_op_reductions, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_logsumexp, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_matching_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_mean, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_nll_loss_and_cross_entropy, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_rotary_embedding_complex_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_shard0_svd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_shard_math_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_softmax_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_softmax_with_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_std, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_topk, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_upsampling, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_vector_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_vector_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_conj_complex_dtensor, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_cumsum, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_add_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_histc, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_bwd_req_grad, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_fwd, 
test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_linalg_eigh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_linear_op_reductions, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_logsumexp, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_matching_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_mean, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_nll_loss_and_cross_entropy, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_rotary_embedding_complex_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_shard0_svd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_shard_math_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_softmax_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_softmax_with_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_std, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_topk, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_upsampling, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_vector_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_vector_norm_partial 2025-12-04T11:35:49.3277898Z 2025-12-04T11:35:49.3278060Z Finished distributed/tensor/test_math_ops 1/1 ... [2025-12-04 11:35:49.325416][2287647.974595905], took 2.47min 2025-12-04T11:35:49.3278616Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:35:49.3282917Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:35:49.3285632Z Running distributed/test_functional_api 1/1 ... [2025-12-04 11:35:49.328479][2287647.977662191] 2025-12-04T11:35:49.3285837Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:35:49.3287440Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_functional_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:35:49.328654] 2025-12-04T11:37:46.3629929Z 2025-12-04T11:37:46.3633899Z distributed/test_functional_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_functional_api_1.1_06d3bb52f6c4d2e0_.log 2025-12-04T11:37:46.3638902Z Running 11 items in this shard: test/distributed/test_functional_api.py::TestMetaCollectives::test_all_reduce, test/distributed/test_functional_api.py::TestMakeFx::test_all_reduce_tracing, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_gather_into_tensor_coalesced_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_1d_input_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_split_sizes_none_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_dce_code_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_fakepg_cuda, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_permute_tensor_with_sub_group_cuda, test/distributed/test_functional_api.py::TestFunctionalAutogradWithDistributedBackendCUDA::test_all_to_all_single_cuda 2025-12-04T11:37:46.3642530Z 2025-12-04T11:37:46.3643272Z Finished distributed/test_functional_api 1/1 ... [2025-12-04 11:37:46.362570][2287765.011750863], took 1.95min 2025-12-04T11:37:46.3643962Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:37:46.3655624Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:37:46.3656081Z Running distributed/_composable/fsdp/test_fully_shard_compile 1/1 ... [2025-12-04 11:37:46.365471][2287765.014655372] 2025-12-04T11:37:46.3656376Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:37:46.3658082Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_compile.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:37:46.365644] 2025-12-04T11:42:23.0934882Z 2025-12-04T11:42:23.0935894Z distributed/_composable/fsdp/test_fully_shard_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_compile_1.1_5c36632b5155c6d2_.log 2025-12-04T11:42:23.0943049Z Running 18 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompileCompute::test_disable_compiling_hooks, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_compiled_autograd_ctx, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_dynamo_recompiles_on_fsdp_layers, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_dynamo_trace_use_training_state, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_aot_eager, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_aot_eager_decomp_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_inductor_fullgraph_False, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_inductor_fullgraph_True, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_inductor_fullgraph_True_graph_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_aot_eager, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_aot_eager_decomp_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_inductor, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_trace_fsdp_copy_, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_aot_eager, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_aot_eager_decomp_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor_fullgraph_False, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor_fullgraph_True, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor_fullgraph_True_graph_partition 2025-12-04T11:42:23.0950230Z 2025-12-04T11:42:23.0950480Z Finished distributed/_composable/fsdp/test_fully_shard_compile 1/1 ... [2025-12-04 11:42:23.093268][2288041.742449257], took 4.61min 2025-12-04T11:42:23.0951117Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:42:23.0958508Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:42:23.0958816Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:42:23.0959060Z Uploading artifacts took 0.00 seconds 2025-12-04T11:42:23.0961417Z Running distributed/_composable/test_replicate 1/1 ... 
[2025-12-04 11:42:23.096002][2288041.745186339] 2025-12-04T11:42:23.0961978Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:42:23.0963179Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/test_replicate.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:42:23.096175] 2025-12-04T11:43:20.9981534Z 2025-12-04T11:43:20.9982740Z distributed/_composable/test_replicate 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_replicate_1.1_aa1eb9fb0e3bb004_.log 2025-12-04T11:43:20.9989265Z Running 17 items in this shard: test/distributed/_composable/test_replicate.py::ReplicateStateDictTest::test_replicate_non_root_multiple_save_load, test/distributed/_composable/test_replicate.py::ReplicateStateDictTest::test_replicate_single_module_save_load, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_device_id, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_ignore_module, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_move_args_kwargs_to_device, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_multi_module, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_single_module, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_with_kwargs, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_wrong_device_id_type, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_device_id, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_fully_shard_init, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_ignore_module, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_move_args_kwargs_to_device, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_multi_module, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_single_module, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_with_kwargs, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_wrong_device_id_type 2025-12-04T11:43:20.9994979Z 2025-12-04T11:43:20.9995254Z Finished distributed/_composable/test_replicate 1/1 ... [2025-12-04 11:43:20.997677][2288099.646856593], took 0.97min 2025-12-04T11:43:20.9996011Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:43:21.0010281Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:43:21.0011684Z Running distributed/checkpoint/test_pg_transport 1/1 ... [2025-12-04 11:43:21.001007][2288099.650191034] 2025-12-04T11:43:21.0011942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:43:21.0013849Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_pg_transport.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:43:21.001184] 2025-12-04T11:43:31.1314417Z 2025-12-04T11:43:31.1316577Z distributed/checkpoint/test_pg_transport 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_pg_transport_1.1_a804652e5136a4d7_.log 2025-12-04T11:43:31.1323297Z Running 21 items in this shard: test/distributed/checkpoint/test_pg_transport.py::PgTransportCPU::test_pg_transport, test/distributed/checkpoint/test_pg_transport.py::PgTransportCPU::test_pg_transport_with_mixed_content, test/distributed/checkpoint/test_pg_transport.py::PgTransportCPU::test_pg_transport_with_sharded_tensor, test/distributed/checkpoint/test_pg_transport.py::PgTransportGPU::test_pg_transport, test/distributed/checkpoint/test_pg_transport.py::PgTransportGPU::test_pg_transport_with_mixed_content, test/distributed/checkpoint/test_pg_transport.py::PgTransportGPU::test_pg_transport_with_sharded_tensor, test/distributed/checkpoint/test_pg_transport.py::TestCastTensor::test_cast_tensor_different_dtypes, test/distributed/checkpoint/test_pg_transport.py::TestCastTensor::test_cast_tensor_with_offset, test/distributed/checkpoint/test_pg_transport.py::TestCastTensor::test_cast_tensor_with_stride, test/distributed/checkpoint/test_pg_transport.py::TestPrepareTensor::test_prepare_tensor_basic, test/distributed/checkpoint/test_pg_transport.py::TestPrepareTensor::test_prepare_tensor_different_shapes, test/distributed/checkpoint/test_pg_transport.py::TestPrepareTensor::test_prepare_tensor_with_stride, test/distributed/checkpoint/test_pg_transport.py::TestPrepareStateDict::test_prepare_state_dict_basic, test/distributed/checkpoint/test_pg_transport.py::TestPrepareStateDict::test_prepare_state_dict_nested, test/distributed/checkpoint/test_pg_transport.py::TestPrepareStateDict::test_prepare_state_dict_with_non_tensor_values, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_recv_checkpoint_basic, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_recv_checkpoint_with_state_dict_callback, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_send_checkpoint_basic, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_send_checkpoint_empty_state_dict, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_send_checkpoint_with_non_tensor_values, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportEdgeCases::test_send_checkpoint_with_cpu_tensors 2025-12-04T11:43:31.1328794Z 2025-12-04T11:43:31.1329024Z Finished distributed/checkpoint/test_pg_transport 1/1 ... [2025-12-04 11:43:31.131177][2288109.780356926], took 0.17min 2025-12-04T11:43:31.1329722Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:43:31.1343761Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:43:31.1346857Z Running distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 ... [2025-12-04 11:43:31.134501][2288109.783685228] 2025-12-04T11:43:31.1347454Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:43:31.1348615Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_mixed_precision.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:43:31.134681] 2025-12-04T11:44:23.9311851Z 2025-12-04T11:44:23.9317186Z distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_mixed_precision_1.1_dab913226be0626b_.log 2025-12-04T11:44:23.9323792Z Running 9 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionTraining::test_compute_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionTraining::test_grad_acc_with_reduce_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionTraining::test_reduce_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_clamp_reduce_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_dataclass_input, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_float16_on_one_submodule, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_norm_modules_bf16, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_norm_modules_fp16, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_submodules_with_external_inputs 2025-12-04T11:44:23.9327992Z 2025-12-04T11:44:23.9328455Z Finished distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 ... [2025-12-04 11:44:23.930730][2288162.579909763], took 0.88min 2025-12-04T11:44:23.9329539Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:44:23.9339269Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:44:23.9341658Z Running distributed/checkpoint/test_utils 1/1 ... [2025-12-04 11:44:23.934013][2288162.583197205] 2025-12-04T11:44:23.9341939Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:44:23.9344907Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:44:23.934196] 2025-12-04T11:44:52.2893333Z 2025-12-04T11:44:52.2894248Z distributed/checkpoint/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_utils_1.1_8e3cc81d9cc30468_.log 2025-12-04T11:44:52.2899079Z Running 16 items in this shard: test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_dcp_logger, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_flat_data, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_index_hint_ignored_on_equals, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_index_hint_ignored_on_hash, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_init_convert_offset, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_sharded_tensor_lookup, test/distributed/checkpoint/test_utils.py::TestReaderView::testAllRead, test/distributed/checkpoint/test_utils.py::TestReaderView::testLongRead, test/distributed/checkpoint/test_utils.py::TestReaderView::testLongReadinto, test/distributed/checkpoint/test_utils.py::TestReaderView::testShortRead, test/distributed/checkpoint/test_utils.py::TestReaderView::testShortReadinto, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_barrier, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_broadcast_object_global_local_mismatch, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_broadcast_object_with_nonzero_coordinator, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_gather_object, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_scatter_object 2025-12-04T11:44:52.2902680Z 2025-12-04T11:44:52.2902906Z Finished distributed/checkpoint/test_utils 1/1 ... [2025-12-04 11:44:52.289121][2288190.938301014], took 0.47min 2025-12-04T11:44:52.2905971Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:44:52.2922645Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:44:52.2925376Z Running distributed/checkpoint/_experimental/test_checkpoint_process 1/1 ... [2025-12-04 11:44:52.292357][2288190.941541017] 2025-12-04T11:44:52.2925689Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:44:52.2926281Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_checkpoint_process.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:44:52.292526] 2025-12-04T11:45:11.4341542Z 2025-12-04T11:45:11.4342546Z distributed/checkpoint/_experimental/test_checkpoint_process 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_checkpoint_process_1.1_f38997afd754e436_.log 2025-12-04T11:45:11.4347595Z Running 15 items in this shard: test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestRequestTypes::test_request_type_enum, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestRequestTypes::test_worker_request, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestRequestTypes::test_worker_response, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcessConfig::test_custom_options, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcessConfig::test_default_options, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_process_initialization, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_write_future_state_dict, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_write_sync_state_dict, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_write_with_kwargs, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_communication_error_handling, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_forced_termination, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_graceful_termination, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_shared_memory_tensor_ipc, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_subprocess_initialization_failure, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_subprocess_initialization_timeout 2025-12-04T11:45:11.4352738Z 2025-12-04T11:45:11.4353002Z Finished distributed/checkpoint/_experimental/test_checkpoint_process 1/1 ... [2025-12-04 11:45:11.433900][2288210.083081142], took 0.32min 2025-12-04T11:45:11.4353731Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:45:11.4366279Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:45:11.4368672Z Running distributed/test_c10d_logger 1/1 ... [2025-12-04 11:45:11.436789][2288210.085972641] 2025-12-04T11:45:11.4368892Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:45:11.4370789Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_logger.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:45:11.436969] 2025-12-04T11:45:20.3148493Z 2025-12-04T11:45:20.3149363Z distributed/test_c10d_logger 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_logger_1.1_564604c60adf8385_.log 2025-12-04T11:45:20.3151338Z Running 2 items in this shard: test/distributed/test_c10d_logger.py::C10dErrorLoggerTest::test_exception_logger, test/distributed/test_c10d_logger.py::C10dErrorLoggerTest::test_get_or_create_logger 2025-12-04T11:45:20.3151985Z 2025-12-04T11:45:20.3152274Z Finished distributed/test_c10d_logger 1/1 ... [2025-12-04 11:45:20.314497][2288218.963677815], took 0.15min 2025-12-04T11:45:20.3156973Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:45:20.3174334Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:45:20.3178687Z Running distributed/_composable/test_replicate_training 1/1 ... [2025-12-04 11:45:20.317532][2288218.966715952] 2025-12-04T11:45:20.3179079Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:45:20.3179832Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/test_replicate_training.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:45:20.317711] 2025-12-04T11:47:31.1177505Z 2025-12-04T11:47:31.1178718Z distributed/_composable/test_replicate_training 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_replicate_training_1.1_f26ae4680a21c31a_.log 2025-12-04T11:47:31.1186537Z Running 17 items in this shard: test/distributed/_composable/test_replicate_training.py::TestReplicateForwardInputs::test_root_move_forward_input_to_device, test/distributed/_composable/test_replicate_training.py::TestReplicateRegisteredParams::test_param_registration_after_backward, test/distributed/_composable/test_replicate_training.py::TestReplicateRegisteredParams::test_param_registration_after_forward, test/distributed/_composable/test_replicate_training.py::TestReplicateCastAfterInit::test_to_float64_after_init, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_explicit_prefetching, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_multi_forward_module, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_non_root_forward_backward, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_post_optim_event, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_train_parity_multi_group_cpu_offload_eager, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_train_parity_multi_groups, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_train_parity_single_group, test/distributed/_composable/test_replicate_training.py::TestReplicateTrainingCompose::test_train_parity_with_activation_checkpointing, test/distributed/_composable/test_replicate_training.py::TestReplicateSharedParams::test_train_parity_with_shared_params, test/distributed/_composable/test_replicate_training.py::TestReplicateGradientAccumulation::test_1f1b_microbatching, 
test/distributed/_composable/test_replicate_training.py::TestReplicateGradientAccumulation::test_gradient_accumulation, test/distributed/_composable/test_replicate_training.py::TestReplicateCustomForwardMethod::test_register_fsdp_forward_method, test/distributed/_composable/test_replicate_training.py::TestReplicateTPTraining::test_replicate_tp 2025-12-04T11:47:31.1191380Z 2025-12-04T11:47:31.1191617Z Finished distributed/_composable/test_replicate_training 1/1 ... [2025-12-04 11:47:31.117438][2288349.766617311], took 2.18min 2025-12-04T11:47:31.1192326Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:47:31.1207621Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:31.1211631Z Running distributed/optim/test_apply_optimizer_in_backward 1/1 ... [2025-12-04 11:47:31.120941][2288349.770125278] 2025-12-04T11:47:31.1212734Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:31.1213649Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/optim/test_apply_optimizer_in_backward.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:47:31.121115] 2025-12-04T11:47:32.3532358Z 2025-12-04T11:47:32.3533260Z distributed/optim/test_apply_optimizer_in_backward 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.optim.test_apply_optimizer_in_backward_1.1_2e4a72c6e91ee59d_.log 2025-12-04T11:47:32.3533828Z 2025-12-04T11:47:32.3534102Z Finished distributed/optim/test_apply_optimizer_in_backward 1/1 ... [2025-12-04 11:47:32.352920][2288351.002102835], took 0.02min 2025-12-04T11:47:32.3541141Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:47:32.3558799Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:32.3560667Z Running distributed/fsdp/test_fsdp_uneven 1/1 ... [2025-12-04 11:47:32.355914][2288351.005097361] 2025-12-04T11:47:32.3560971Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:32.3562798Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_uneven.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:47:32.356049] 2025-12-04T11:48:05.4641859Z 2025-12-04T11:48:05.4642860Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven 1/1 (test/test-reports/distributed.fsdp.test_fsdp_uneven_1.1_73d54334789787ed_.log) 2025-12-04T11:48:05.4644249Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-0cec19ea9b3dfbff.xml 2025-12-04T11:48:05.4645172Z ============================= test session starts ============================== 2025-12-04T11:48:05.4645802Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4646359Z cachedir: .pytest_cache 2025-12-04T11:48:05.4647004Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4647678Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4648001Z configfile: pytest.ini 2025-12-04T11:48:05.4648712Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4649385Z collecting ... collected 1 item 2025-12-04T11:48:05.4649765Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T11:48:05.4650350Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4650545Z 2025-12-04T11:48:05.4650839Z distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda I1204 11:47:34.036000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 335486 2025-12-04T11:48:05.4651320Z I1204 11:47:34.037000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 335487 2025-12-04T11:48:05.4651762Z I1204 11:47:34.037000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 335488 2025-12-04T11:48:05.4652107Z I1204 11:47:34.038000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 335489 2025-12-04T11:48:05.4652446Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4653356Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4653857Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4654382Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4654889Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4655346Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4655792Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T11:48:05.4656262Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4656730Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4657307Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4657772Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4658279Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4658735Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4659207Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4659889Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4660507Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4660865Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4661434Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4661919Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4662287Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4662703Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:48:05.4662946Z dist init r=1, world=4 2025-12-04T11:48:05.4663191Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4663528Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4664020Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4664498Z 
[rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4664973Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4665421Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4665859Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4666361Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4666822Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4667282Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4667750Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4668236Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4668690Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4669153Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4669792Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2243952640 and is now 3240099840. 
2025-12-04T11:48:05.4670390Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4670737Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4671299Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4671778Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4672187Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4672601Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:48:05.4672843Z dist init r=3, world=4 2025-12-04T11:48:05.4673046Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4673383Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4673867Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4674346Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4674823Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4675268Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4675738Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4676202Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4676663Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4677130Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4677590Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4678040Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4678530Z 
[rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4678996Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4679630Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4680227Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4680575Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4681134Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4681652Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4682016Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4682428Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:48:05.4682669Z dist init r=2, world=4 2025-12-04T11:48:05.4682869Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4683203Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4683691Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4684168Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4684648Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4685146Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4685583Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4686047Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4686508Z [rank0]:E1204 11:47:40.265000 335486 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4686970Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4687432Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4687881Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4688376Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4688841Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4689474Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 2025-12-04T11:48:05.4690073Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4690420Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4691009Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4691485Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4691849Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4692260Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:48:05.4692501Z dist init r=0, world=4 2025-12-04T11:48:05.4692916Z [rank0]:[W1204 11:47:40.176653701 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:48:05.4693327Z FAILED [7.9118s] [100%] 2025-12-04T11:48:05.4693393Z 2025-12-04T11:48:05.4693453Z =================================== FAILURES =================================== 2025-12-04T11:48:05.4693673Z _______________ TestUnevenParamShardCUDA.test_one_iteration_cuda _______________ 2025-12-04T11:48:05.4693849Z Traceback (most recent call last): 2025-12-04T11:48:05.4694097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:48:05.4694341Z self._join_processes(fn) 2025-12-04T11:48:05.4694584Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:48:05.4694849Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:48:05.4695118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:48:05.4695376Z raise RuntimeError(error) 2025-12-04T11:48:05.4695528Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4695691Z Traceback (most recent call last): 2025-12-04T11:48:05.4695930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4696169Z getattr(self, test_name)() 2025-12-04T11:48:05.4696399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4696629Z fn() 2025-12-04T11:48:05.4696831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4697060Z method(*args, **kwargs) 2025-12-04T11:48:05.4697281Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4697510Z method(*args, **kwargs) 2025-12-04T11:48:05.4697727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4697951Z with policy(): 2025-12-04T11:48:05.4698213Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4698444Z raise RuntimeError(msg) 2025-12-04T11:48:05.4698833Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4699192Z 2025-12-04T11:48:05.4699266Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4699615Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4699857Z 2025-12-04T11:48:05.4699945Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4700072Z 2025-12-04T11:48:05.4700074Z 2025-12-04T11:48:05.4700155Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:48:05.4700359Z Process 1 terminated with exit code 10, terminating remaining processes. 
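Note: the RuntimeError above is produced by PyTorch's CUDA/HIP memory leak check, which compares per-device memory counters taken before and after the test body and fails the test when both the caching allocator and the driver report more memory held afterwards. The Python sketch below only illustrates that before/after comparison; it is not the implementation in torch/testing/_internal/common_utils.py, and the run_with_leak_check helper and the _kept global are made up for the example.

import torch

def run_with_leak_check(test_fn, device=0):
    # Snapshot both views of device memory before the test body runs.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info(device)    # driver-level free/total bytes

    test_fn()

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)

    # Flag a leak only when both the caching allocator and the driver agree
    # that more memory is held after the test, mirroring the message above.
    if alloc_after > alloc_before and free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went "
            f"{alloc_before} -> {alloc_after} bytes, driver-allocated went "
            f"{total - free_before} -> {total - free_after} bytes"
        )

def _leaky():
    # Keeping a module-level reference prevents the tensor from being freed,
    # so the post-test counters stay above the pre-test ones.
    global _kept
    _kept = torch.ones(1024, device="cuda:0")

if __name__ == "__main__":
    run_with_leak_check(_leaky)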
2025-12-04T11:48:05.4700726Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-0cec19ea9b3dfbff.xml - 2025-12-04T11:48:05.4701064Z =========================== short test summary info ============================ 2025-12-04T11:48:05.4701388Z FAILED [7.9118s] distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4701693Z Traceback (most recent call last): 2025-12-04T11:48:05.4701936Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4702179Z getattr(self, test_name)() 2025-12-04T11:48:05.4702409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4702672Z fn() 2025-12-04T11:48:05.4702873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4703108Z method(*args, **kwargs) 2025-12-04T11:48:05.4703328Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4703559Z method(*args, **kwargs) 2025-12-04T11:48:05.4703775Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4704004Z with policy(): 2025-12-04T11:48:05.4704217Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4704447Z raise RuntimeError(msg) 2025-12-04T11:48:05.4704842Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4705200Z 2025-12-04T11:48:05.4705275Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4705587Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4705825Z 2025-12-04T11:48:05.4705916Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4706104Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:48:05.4706262Z ============================== 1 failed in 7.92s =============================== 2025-12-04T11:48:05.4706393Z Got exit code 1 2025-12-04T11:48:05.4706490Z Retrying single test... 
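Note: the repro line printed above (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda) can be run directly from the base repo dir. The sketch below is a hypothetical wrapper, not part of the PyTorch repo, that invokes the same command via subprocess; it assumes the current working directory is a pytorch checkout and that ROCm GPUs are visible.

import os
import subprocess

# Same environment variables and command as the repro line in the log above.
env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_uneven.py",
        "TestUnevenParamShardCUDA.test_one_iteration_cuda",
    ],
    env=env,
    check=True,  # raises CalledProcessError if the leak check trips again
)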
2025-12-04T11:48:05.4706755Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-39cddf3a330f88b6.xml 2025-12-04T11:48:05.4707043Z ============================= test session starts ============================== 2025-12-04T11:48:05.4707253Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4707440Z cachedir: .pytest_cache 2025-12-04T11:48:05.4707663Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4707902Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4708020Z configfile: pytest.ini 2025-12-04T11:48:05.4708336Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4708578Z collecting ... collected 1 item 2025-12-04T11:48:05.4708846Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4709122Z Running 1 items in this shard 2025-12-04T11:48:05.4709193Z 2025-12-04T11:48:05.4709479Z distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda I1204 11:47:44.379000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 335888 2025-12-04T11:48:05.4709948Z I1204 11:47:44.380000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 335889 2025-12-04T11:48:05.4710293Z I1204 11:47:44.380000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 335890 2025-12-04T11:48:05.4710633Z I1204 11:47:44.381000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 335891 2025-12-04T11:48:05.4710962Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4711333Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4711824Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4712309Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4712786Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4713231Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4713673Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4714134Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4714594Z [rank2]:E1204 11:47:50.418000 335890 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4715060Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4715524Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4715976Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4716430Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4716891Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4717568Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4718211Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4718563Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4719129Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4719608Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4719974Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4720391Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:48:05.4720669Z dist init r=2, world=4 2025-12-04T11:48:05.4720874Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4721210Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4721704Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4722192Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4722674Z [rank0]:E1204 11:47:50.428000 335888 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4732679Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4733142Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4733627Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4734105Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4734576Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4735053Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4735514Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4736052Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4736522Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4737165Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 
2025-12-04T11:48:05.4737769Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4738119Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4738733Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4739213Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4739615Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4740031Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:48:05.4740278Z dist init r=0, world=4 2025-12-04T11:48:05.4740488Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4740831Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4741320Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4741803Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4742279Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4742726Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4743170Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4743638Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4744101Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4744564Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4745025Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4745507Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4745962Z 
[rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4746431Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4747067Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3240099840. 2025-12-04T11:48:05.4747662Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4748011Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4748619Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4749136Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4749502Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4749915Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:48:05.4750156Z dist init r=3, world=4 2025-12-04T11:48:05.4750363Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4750699Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4751185Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4751664Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4752144Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4752591Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4753029Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4753495Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4753960Z [rank1]:E1204 11:47:50.433000 335889 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4754420Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4754913Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4755364Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4755820Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4756286Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4756922Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4757516Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4757863Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4758496Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4758973Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4759337Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4759751Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:48:05.4759995Z dist init r=1, world=4 2025-12-04T11:48:05.4760398Z [rank0]:[W1204 11:47:50.275869666 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:48:05.4760811Z FAILED [7.9132s] [100%] 2025-12-04T11:48:05.4760878Z 2025-12-04T11:48:05.4760939Z =================================== FAILURES =================================== 2025-12-04T11:48:05.4761130Z _______________ TestUnevenParamShardCUDA.test_one_iteration_cuda _______________ 2025-12-04T11:48:05.4761305Z Traceback (most recent call last): 2025-12-04T11:48:05.4761555Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:48:05.4761799Z self._join_processes(fn) 2025-12-04T11:48:05.4762045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:48:05.4762313Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:48:05.4762582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:48:05.4762843Z raise RuntimeError(error) 2025-12-04T11:48:05.4762995Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:48:05.4763157Z Traceback (most recent call last): 2025-12-04T11:48:05.4763398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4763639Z getattr(self, test_name)() 2025-12-04T11:48:05.4763912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4764146Z fn() 2025-12-04T11:48:05.4764348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4764582Z method(*args, **kwargs) 2025-12-04T11:48:05.4764805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4765036Z method(*args, **kwargs) 2025-12-04T11:48:05.4765253Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4765481Z with policy(): 2025-12-04T11:48:05.4765693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4765923Z raise RuntimeError(msg) 2025-12-04T11:48:05.4766317Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 
2025-12-04T11:48:05.4766678Z 2025-12-04T11:48:05.4766796Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4767111Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4767351Z 2025-12-04T11:48:05.4767440Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4767566Z 2025-12-04T11:48:05.4767627Z Process 2 exited with error code 10 and exception: 2025-12-04T11:48:05.4767767Z Traceback (most recent call last): 2025-12-04T11:48:05.4768010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4768289Z getattr(self, test_name)() 2025-12-04T11:48:05.4768521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4768752Z fn() 2025-12-04T11:48:05.4768956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4769187Z method(*args, **kwargs) 2025-12-04T11:48:05.4769405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4769633Z method(*args, **kwargs) 2025-12-04T11:48:05.4769850Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4770074Z with policy(): 2025-12-04T11:48:05.4770287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4770518Z raise RuntimeError(msg) 2025-12-04T11:48:05.4770908Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4771265Z 2025-12-04T11:48:05.4771341Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4771653Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4771889Z 2025-12-04T11:48:05.4771979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4772102Z 2025-12-04T11:48:05.4772104Z 2025-12-04T11:48:05.4772185Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:48:05.4772423Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:48:05.4772790Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-39cddf3a330f88b6.xml - 2025-12-04T11:48:05.4773130Z =========================== short test summary info ============================ 2025-12-04T11:48:05.4773456Z FAILED [7.9132s] distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:48:05.4773760Z Traceback (most recent call last): 2025-12-04T11:48:05.4774005Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4774250Z getattr(self, test_name)() 2025-12-04T11:48:05.4774482Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4774718Z fn() 2025-12-04T11:48:05.4774920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4775153Z method(*args, **kwargs) 2025-12-04T11:48:05.4775371Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4775632Z method(*args, **kwargs) 2025-12-04T11:48:05.4775849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4776074Z with policy(): 2025-12-04T11:48:05.4776284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4776513Z raise RuntimeError(msg) 2025-12-04T11:48:05.4776908Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 
2025-12-04T11:48:05.4777264Z 2025-12-04T11:48:05.4777338Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4777647Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4777887Z 2025-12-04T11:48:05.4777973Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4778098Z 2025-12-04T11:48:05.4778192Z Process 2 exited with error code 10 and exception: 2025-12-04T11:48:05.4778333Z Traceback (most recent call last): 2025-12-04T11:48:05.4778576Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4778817Z getattr(self, test_name)() 2025-12-04T11:48:05.4779049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4779280Z fn() 2025-12-04T11:48:05.4779480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4779711Z method(*args, **kwargs) 2025-12-04T11:48:05.4779930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4780158Z method(*args, **kwargs) 2025-12-04T11:48:05.4780375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4780600Z with policy(): 2025-12-04T11:48:05.4780810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4781041Z raise RuntimeError(msg) 2025-12-04T11:48:05.4781463Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4781817Z 2025-12-04T11:48:05.4781893Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4782200Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4782437Z 2025-12-04T11:48:05.4782523Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4782711Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:48:05.4782871Z ============================== 1 failed in 7.92s =============================== 2025-12-04T11:48:05.4783003Z Got exit code 1 2025-12-04T11:48:05.4783101Z Retrying single test... 
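The "Caching allocator allocated memory was 512 and is now reported as 1024" errors above come from PyTorch's memory-leak checker, enabled here through PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1. The following is only a minimal sketch of the idea, not the actual harness in common_utils.py: snapshot per-device allocator usage before the test body and fail if it has grown afterwards (the real check also consults driver-level allocations, as the messages above show).

    import torch

    def run_with_leak_check(test_fn, device: int = 0) -> None:
        # Illustrative only; the real checker also compares CUDA/HIP driver-level
        # allocations before declaring a leak.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)
        test_fn()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: {before} -> {after} bytes"
            )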
2025-12-04T11:48:05.4783366Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-6a36a936c5172dbc.xml 2025-12-04T11:48:05.4783657Z ============================= test session starts ============================== 2025-12-04T11:48:05.4783908Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4784101Z cachedir: .pytest_cache 2025-12-04T11:48:05.4784328Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4784570Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4784693Z configfile: pytest.ini 2025-12-04T11:48:05.4784924Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4785170Z collecting ... collected 1 item 2025-12-04T11:48:05.4785444Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4785720Z Running 1 items in this shard 2025-12-04T11:48:05.4785792Z 2025-12-04T11:48:05.4786081Z distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda I1204 11:47:54.603000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 336290 2025-12-04T11:48:05.4786556Z I1204 11:47:54.604000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 336291 2025-12-04T11:48:05.4786900Z I1204 11:47:54.604000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 336292 2025-12-04T11:48:05.4787240Z I1204 11:47:54.605000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 336293 2025-12-04T11:48:05.4787573Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4787913Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4788459Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4788947Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4789428Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4789881Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4790445Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4790918Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4791386Z [rank2]:E1204 11:48:00.764000 336292 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4791851Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4792318Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4792772Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4793228Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4793727Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4794372Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4794973Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4795326Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4795893Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4796375Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4796746Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4797168Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:48:05.4797414Z dist init r=2, world=4 2025-12-04T11:48:05.4797621Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4797963Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4798484Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4798965Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4799489Z [rank1]:E1204 11:48:00.768000 336291 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4799944Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4800389Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4800852Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4801316Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4801780Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4802244Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4802729Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4803182Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4803651Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4804294Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 
2025-12-04T11:48:05.4804893Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4805243Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4805805Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4806284Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4806652Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4807066Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:48:05.4807310Z dist init r=1, world=4 2025-12-04T11:48:05.4807514Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4807851Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4808385Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4808896Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4809380Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4809832Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4810272Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4810739Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4811208Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4811673Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4812169Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4812623Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4813080Z 
[rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4813546Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4814187Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2243952640 and is now 3240099840. 2025-12-04T11:48:05.4814784Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4815135Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4815701Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4816185Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4816555Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4816972Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:48:05.4817214Z dist init r=3, world=4 2025-12-04T11:48:05.4817417Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4817757Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4818310Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4818795Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4819277Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4819731Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4820174Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4820643Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4821109Z [rank0]:E1204 11:48:00.797000 336290 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4821601Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4822064Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4822520Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4822976Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4823442Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4824086Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 2025-12-04T11:48:05.4824682Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4825034Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4825595Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4826074Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4826439Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4826855Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:48:05.4827098Z dist init r=0, world=4 2025-12-04T11:48:05.4827523Z [rank0]:[W1204 11:48:01.665824755 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:48:05.4827940Z FAILED [8.0123s] [100%] 2025-12-04T11:48:05.4828010Z 2025-12-04T11:48:05.4828067Z =================================== FAILURES =================================== 2025-12-04T11:48:05.4828296Z _______________ TestUnevenParamShardCUDA.test_one_iteration_cuda _______________ 2025-12-04T11:48:05.4828475Z Traceback (most recent call last): 2025-12-04T11:48:05.4828725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:48:05.4828972Z self._join_processes(fn) 2025-12-04T11:48:05.4829223Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:48:05.4829492Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:48:05.4829762Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:48:05.4830024Z raise RuntimeError(error) 2025-12-04T11:48:05.4830211Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4830375Z Traceback (most recent call last): 2025-12-04T11:48:05.4830619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4830864Z getattr(self, test_name)() 2025-12-04T11:48:05.4831097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4831333Z fn() 2025-12-04T11:48:05.4831536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4831773Z method(*args, **kwargs) 2025-12-04T11:48:05.4831996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4832228Z method(*args, **kwargs) 2025-12-04T11:48:05.4832450Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4832678Z with policy(): 2025-12-04T11:48:05.4832892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4833125Z raise RuntimeError(msg) 2025-12-04T11:48:05.4833517Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4833872Z 2025-12-04T11:48:05.4833949Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4834264Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4834505Z 2025-12-04T11:48:05.4834597Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4834725Z 2025-12-04T11:48:05.4834727Z 2025-12-04T11:48:05.4834805Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:48:05.4835010Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:48:05.4835379Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-6a36a936c5172dbc.xml - 2025-12-04T11:48:05.4835717Z =========================== short test summary info ============================ 2025-12-04T11:48:05.4836077Z FAILED [8.0123s] distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4836383Z Traceback (most recent call last): 2025-12-04T11:48:05.4836631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4836878Z getattr(self, test_name)() 2025-12-04T11:48:05.4837113Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4837348Z fn() 2025-12-04T11:48:05.4837551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4837782Z method(*args, **kwargs) 2025-12-04T11:48:05.4838004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4838278Z method(*args, **kwargs) 2025-12-04T11:48:05.4838500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4838732Z with policy(): 2025-12-04T11:48:05.4838945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4839220Z raise RuntimeError(msg) 2025-12-04T11:48:05.4839612Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4839965Z 2025-12-04T11:48:05.4840042Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4840358Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4840595Z 2025-12-04T11:48:05.4840687Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4840877Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
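Each run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the explicit teardown that warning asks for follows; the nccl backend and env:// rendezvous are illustrative assumptions, not values taken from this job.

    import torch.distributed as dist

    def main() -> None:
        # Assumes rank/world size come from the usual env:// variables
        # (RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT).
        dist.init_process_group(backend="nccl")
        try:
            ...  # training or test body
        finally:
            # Explicit teardown avoids the resource-leak warning at exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()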
2025-12-04T11:48:05.4841039Z ============================== 1 failed in 8.02s =============================== 2025-12-04T11:48:05.4841174Z Got exit code 1 2025-12-04T11:48:05.4841386Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4841700Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:48:05.4842064Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-b3a2e625edae2d2f.xml 2025-12-04T11:48:05.4842358Z ============================= test session starts ============================== 2025-12-04T11:48:05.4842570Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4842759Z cachedir: .pytest_cache 2025-12-04T11:48:05.4842980Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4843221Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4843337Z configfile: pytest.ini 2025-12-04T11:48:05.4843561Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4843829Z collecting ... collected 1 item / 1 deselected / 0 selected 2025-12-04T11:48:05.4843988Z stepcurrent: skipping 1 already run items. 2025-12-04T11:48:05.4844120Z Running 0 items in this shard 2025-12-04T11:48:05.4844192Z 2025-12-04T11:48:05.4844433Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-b3a2e625edae2d2f.xml - 2025-12-04T11:48:05.4844806Z ============================ 1 deselected in 0.00s ============================= 2025-12-04T11:48:05.4845083Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda'] 2025-12-04T11:48:05.4845298Z 2025-12-04T11:48:05.4845488Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven 1/1 (test/test-reports/distributed.fsdp.test_fsdp_uneven_1.1_73d54334789787ed_.log) 2025-12-04T11:48:05.4845712Z 2025-12-04T11:48:05.4845838Z Finished distributed/fsdp/test_fsdp_uneven 1/1 ... [2025-12-04 11:48:05.464076][2288384.113256447], took 0.55min 2025-12-04T11:48:05.4846260Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:48:05.4846652Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:05.4846873Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:48:05.4847053Z Uploading artifacts took 0.00 seconds 2025-12-04T11:48:05.4847189Z distributed/fsdp/test_fsdp_uneven 1/1 failed! 2025-12-04T11:48:05.4847391Z Running distributed/tensor/test_op_strategy 1/1 ... [2025-12-04 11:48:05.466924][2288384.11610755] 2025-12-04T11:48:05.4847617Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:05.4848018Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_op_strategy.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:48:05.467106] 2025-12-04T11:48:30.4684854Z 2025-12-04T11:48:30.4686004Z distributed/tensor/test_op_strategy 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_op_strategy_1.1_f65ccb2b3fdeb576_.log 2025-12-04T11:48:30.4694305Z Running 24 items in this shard: test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_batch_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_bmm_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_free_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_mm_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_diffinndim_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_diffoutndim_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_linearity_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_mm_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_mm_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_pointwise_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_bmm_strategies, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_mm_strategies, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_redistribute_cost_latency, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_redistribute_cost_mesh_1d, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_redistribute_cost_mesh_2d, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTest::test_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTest::test_tuple_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::TestStrategyHashing::test_call_with_different_nontensor_args, test/distributed/tensor/test_op_strategy.py::TestStrategyOperation::test_cache_clean, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTestWithLocalTensor::test_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTestWithLocalTensor::test_tuple_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::TestStrategyHashingWithLocalTensor::test_call_with_different_nontensor_args 2025-12-04T11:48:30.4697707Z 2025-12-04T11:48:30.4697843Z Finished distributed/tensor/test_op_strategy 1/1 ... [2025-12-04 11:48:30.468065][2288409.117244593], took 0.42min 2025-12-04T11:48:30.4698308Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:48:30.4707633Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:30.4711209Z Running distributed/fsdp/test_fsdp_grad_acc 1/1 ... 
[2025-12-04 11:48:30.471001][2288409.120183193] 2025-12-04T11:48:30.4711419Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:30.4712967Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_grad_acc.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:48:30.471210] 2025-12-04T11:49:23.1167237Z 2025-12-04T11:49:23.1169396Z distributed/fsdp/test_fsdp_grad_acc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_grad_acc_1.1_6157c2e534b414ab_.log 2025-12-04T11:49:23.1175396Z Running 6 items in this shard: test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs0_use_orig_params_False, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs0_use_orig_params_True, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs1_use_orig_params_False, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs1_use_orig_params_True, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_cpu_offload_use_orig_params_False, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_cpu_offload_use_orig_params_True 2025-12-04T11:49:23.1178882Z 2025-12-04T11:49:23.1179578Z Finished distributed/fsdp/test_fsdp_grad_acc 1/1 ... [2025-12-04 11:49:23.116467][2288461.765646529], took 0.88min 2025-12-04T11:49:23.1180691Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:49:23.1198550Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:49:23.1201573Z Running distributed/checkpoint/test_state_dict_stager 1/1 ... [2025-12-04 11:49:23.120002][2288461.769185945] 2025-12-04T11:49:23.1201883Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:49:23.1203028Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_state_dict_stager.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:49:23.120185] 2025-12-04T11:49:48.6741757Z 2025-12-04T11:49:48.6747049Z distributed/checkpoint/test_state_dict_stager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_state_dict_stager_1.1_18563662566f98e7_.log 2025-12-04T11:49:48.6753449Z Running 14 items in this shard: test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_caching, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_complex_storage_sharing, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_cpu_storage_independence, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_dataclasses, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_different_dtypes, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_empty_tensors, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_tensor_attrs, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_tensor_pinned_and_shared, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_views, test/distributed/checkpoint/test_state_dict_stager.py::TestDTensorStateDictStager::test_dtensor, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_basic, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_dtensors, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_persistence, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_sharded_tensors 2025-12-04T11:49:48.6757639Z 2025-12-04T11:49:48.6757940Z Finished distributed/checkpoint/test_state_dict_stager 1/1 ... [2025-12-04 11:49:48.673912][2288487.323090275], took 0.43min 2025-12-04T11:49:48.6758876Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:49:48.6773100Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:49:48.6775872Z Running distributed/fsdp/test_fsdp_freezing_weights 1/1 ... [2025-12-04 11:49:48.677483][2288487.326667078] 2025-12-04T11:49:48.6776155Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:49:48.6777807Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_freezing_weights.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:49:48.677660] 2025-12-04T11:53:50.3187145Z 2025-12-04T11:53:50.3188408Z distributed/fsdp/test_fsdp_freezing_weights 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_freezing_weights_1.1_ca4a55d16ff319d1_.log 2025-12-04T11:53:50.3205731Z Running 32 items in this shard: test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, 
test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, 
test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True 2025-12-04T11:53:50.3217372Z 2025-12-04T11:53:50.3217522Z Finished distributed/fsdp/test_fsdp_freezing_weights 1/1 ... [2025-12-04 11:53:50.318573][2288728.967751208], took 4.03min 2025-12-04T11:53:50.3217984Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:53:50.3218463Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:53:50.3221583Z Running distributed/_composable/fsdp/test_fully_shard_init 1/1 ... [2025-12-04 11:53:50.322032][2288728.971215445] 2025-12-04T11:53:50.3221818Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:53:50.3223300Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:53:50.322216] 2025-12-04T11:54:03.0561568Z 2025-12-04T11:54:03.0562452Z distributed/_composable/fsdp/test_fully_shard_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_init_1.1_b06e8c3d530e8d5f_.log 2025-12-04T11:54:03.0574456Z Running 42 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceTensor::test_move_states_to_device_ignored_param_device, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceTensor::test_move_states_to_device_tensor, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_invalid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_valid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_2d_mesh_without_mesh_dim_names, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_invalid_mesh_ndim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_duplicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested_fully_shard_and_replicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_single, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_nested_fully_shard, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_shared_params_and_buffers, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_duplicates, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_shared_params, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_noncontiguous_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_scalar_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_shard_tensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterDTensor::test_shard_dtensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_double_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_is_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_module_and_param_fqns, 
test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_multi_module_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_reset_sharded_param_in_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_invalid_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_1d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_2d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_rank0_broadcast_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_1d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_2d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardHSDPBroadcast::test_hsdp_broadcast_across_replicas, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestHSDPWithCustomHook::test_custom_hook_custom_stream, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestHSDPWithCustomHook::test_custom_hsdp_all_reduce_hook, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_dim_neg1, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_uneven_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_2d_transformer_shard_diff_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_invalid_shard_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardOldImport::test_old_import_training, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMixedDtypeParam::test_mixed_dtypes_no_grad_param 2025-12-04T11:54:03.0583346Z 2025-12-04T11:54:03.0583532Z Finished distributed/_composable/fsdp/test_fully_shard_init 1/1 ... [2025-12-04 11:54:03.055909][2288741.705087673], took 0.21min 2025-12-04T11:54:03.0584064Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:54:03.0592278Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:54:03.0595675Z Running distributed/fsdp/test_fsdp_exec_order 1/1 ... [2025-12-04 11:54:03.059427][2288741.708610368] 2025-12-04T11:54:03.0595885Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:54:03.0597357Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_exec_order.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:54:03.059613] 2025-12-04T11:58:24.4280998Z 2025-12-04T11:58:24.4281610Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order 1/1 (test/test-reports/distributed.fsdp.test_fsdp_exec_order_1.1_e994e873868c2dab_.log) 2025-12-04T11:58:24.4282610Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-208acf942d1af133.xml 2025-12-04T11:58:24.4282927Z ============================= test session starts ============================== 2025-12-04T11:58:24.4283157Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4283354Z cachedir: .pytest_cache 2025-12-04T11:58:24.4283583Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4283828Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4283961Z configfile: pytest.ini 2025-12-04T11:58:24.4284194Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4284449Z collecting ... collected 8 items 2025-12-04T11:58:24.4284612Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T11:58:24.4286237Z Running 8 items in this shard: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.4287841Z 2025-12-04T11:58:24.4288458Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:54:04.744000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 351925 2025-12-04T11:58:24.4288987Z I1204 11:54:04.744000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 351926 2025-12-04T11:58:24.4289500Z I1204 11:54:04.745000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 351927 2025-12-04T11:58:24.4289838Z I1204 11:54:04.746000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 351928 2025-12-04T11:58:24.4290529Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4291157Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4291748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4292331Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4292912Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4293544Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4294123Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4294707Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4294970Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4295372Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4295871Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4296352Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4296835Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4297290Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4297736Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4298262Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4298731Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T11:58:24.4299235Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4299699Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4300161Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4300640Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4301121Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4301805Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4314816Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4315203Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4315849Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4316450Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4316845Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4317271Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4317522Z dist init r=0, world=4 2025-12-04T11:58:24.4317740Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4318086Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4318647Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4319135Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4319620Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4320074Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4320519Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4321070Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4321537Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4322006Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4322474Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4322928Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4323387Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4323860Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4324580Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4325217Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4325573Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4326187Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4326720Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4327089Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4327507Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4327753Z dist init r=2, world=4 2025-12-04T11:58:24.4327961Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4328335Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4328823Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4329310Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4329791Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4330240Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4330732Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4331199Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4331666Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4332129Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4332597Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4333050Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4333507Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4334017Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4334691Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4335330Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4335679Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4336286Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4336807Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4337174Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4337593Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4337838Z dist init r=1, world=4 2025-12-04T11:58:24.4338043Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4338435Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4338924Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4339404Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4339921Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4340377Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4340820Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4341286Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T11:58:24.4341751Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4342217Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4342683Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4343183Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4343642Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4344110Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4344786Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4345418Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4345769Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4346380Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4346903Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4347271Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4347687Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4347933Z dist init r=3, world=4 2025-12-04T11:58:24.4348039Z FAILED [6.1123s] [ 12%] 2025-12-04T11:58:24.4348105Z 2025-12-04T11:58:24.4348203Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4348411Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:58:24.4348602Z Traceback (most recent call last): 2025-12-04T11:58:24.4348853Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4349145Z self._join_processes(fn) 2025-12-04T11:58:24.4349395Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4349664Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4349937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4350203Z raise RuntimeError(error) 2025-12-04T11:58:24.4350363Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4350531Z Traceback (most recent call last): 2025-12-04T11:58:24.4350776Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4351022Z getattr(self, test_name)() 2025-12-04T11:58:24.4351259Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4351498Z fn() 2025-12-04T11:58:24.4351704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4351994Z method(*args, **kwargs) 2025-12-04T11:58:24.4352218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4352451Z method(*args, **kwargs) 2025-12-04T11:58:24.4352672Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4352902Z with policy(): 2025-12-04T11:58:24.4353118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4353351Z raise RuntimeError(msg) 2025-12-04T11:58:24.4353783Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4354176Z 2025-12-04T11:58:24.4354259Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4354622Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4354902Z 2025-12-04T11:58:24.4354997Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4355126Z 2025-12-04T11:58:24.4355128Z 2025-12-04T11:58:24.4355210Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4355417Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4355802Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-208acf942d1af133.xml - 2025-12-04T11:58:24.4356153Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4356523Z FAILED [6.1123s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4356865Z Traceback (most recent call last): 2025-12-04T11:58:24.4357115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4357362Z getattr(self, test_name)() 2025-12-04T11:58:24.4357598Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4357835Z fn() 2025-12-04T11:58:24.4358071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4358347Z method(*args, **kwargs) 2025-12-04T11:58:24.4358568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4358802Z method(*args, **kwargs) 2025-12-04T11:58:24.4359023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4359250Z with policy(): 2025-12-04T11:58:24.4359465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4359699Z raise RuntimeError(msg) 2025-12-04T11:58:24.4360132Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4360528Z 2025-12-04T11:58:24.4360604Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4360964Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4361296Z 2025-12-04T11:58:24.4361390Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4361581Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4361742Z ============================== 1 failed in 6.12s =============================== 2025-12-04T11:58:24.4361873Z Got exit code 1 2025-12-04T11:58:24.4361974Z Retrying single test... 
2025-12-04T11:58:24.4362254Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-22d3b0b8730091a0.xml 2025-12-04T11:58:24.4362558Z ============================= test session starts ============================== 2025-12-04T11:58:24.4362774Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4362967Z cachedir: .pytest_cache 2025-12-04T11:58:24.4363194Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4363435Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4363559Z configfile: pytest.ini 2025-12-04T11:58:24.4363792Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4364068Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4364421Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4364743Z Running 1 items in this shard 2025-12-04T11:58:24.4364817Z 2025-12-04T11:58:24.4365144Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:54:13.436000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 352303 2025-12-04T11:58:24.4365663Z I1204 11:54:13.437000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 352304 2025-12-04T11:58:24.4366010Z I1204 11:54:13.438000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 352305 2025-12-04T11:58:24.4366353Z I1204 11:54:13.439000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 352306 2025-12-04T11:58:24.4367093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4367688Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4368326Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4368914Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4369506Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4370125Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4370709Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4371291Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4371531Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4371876Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4372366Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4372849Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4373331Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4373780Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4374226Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4374693Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4375158Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4375619Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4376117Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4376570Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4377025Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4377489Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4378218Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2243952640 and is now 3005218816. 2025-12-04T11:58:24.4378855Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4379204Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4379847Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4380371Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4380735Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4381152Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4381396Z dist init r=3, world=4 2025-12-04T11:58:24.4381601Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4381940Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4382425Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4382905Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4383385Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4383832Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4384272Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4384736Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4385197Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4385697Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4386159Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4386610Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4387062Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4387530Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4388250Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4388915Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4389262Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4389864Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4390386Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4390748Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4391163Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4391403Z dist init r=1, world=4 2025-12-04T11:58:24.4391604Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4391940Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4392427Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4392909Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4393388Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4393838Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4394276Z [rank2]:E1204 11:54:18.763000 352305 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4394773Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4395235Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4395700Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4396160Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4396609Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4397062Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4397527Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4398248Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4398910Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4399258Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4399866Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4400387Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4400750Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4401163Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4401404Z dist init r=2, world=4 2025-12-04T11:58:24.4401606Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4401943Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4402428Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4402908Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4403385Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4403831Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4404302Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4404764Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4405227Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4405686Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4406147Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4406597Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4407055Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4407541Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4408260Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4408890Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4409238Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4409845Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4410363Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4410725Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4411138Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4411378Z dist init r=0, world=4 2025-12-04T11:58:24.4411481Z FAILED [6.3109s] [100%] 2025-12-04T11:58:24.4411548Z 2025-12-04T11:58:24.4411606Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4411807Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:58:24.4411991Z Traceback (most recent call last): 2025-12-04T11:58:24.4412234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4412474Z self._join_processes(fn) 2025-12-04T11:58:24.4412718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4412979Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4413278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4413540Z raise RuntimeError(error) 2025-12-04T11:58:24.4413691Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4413852Z Traceback (most recent call last): 2025-12-04T11:58:24.4414090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4414331Z getattr(self, test_name)() 2025-12-04T11:58:24.4414559Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4414788Z fn() 2025-12-04T11:58:24.4414989Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4415221Z method(*args, **kwargs) 2025-12-04T11:58:24.4415441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4415669Z method(*args, **kwargs) 2025-12-04T11:58:24.4415884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4416144Z with policy(): 2025-12-04T11:58:24.4416352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4416579Z raise RuntimeError(msg) 2025-12-04T11:58:24.4417020Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2243952640 and is now 3005218816. 2025-12-04T11:58:24.4417409Z 2025-12-04T11:58:24.4417484Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4417841Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4418121Z 2025-12-04T11:58:24.4418232Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4418357Z 2025-12-04T11:58:24.4418359Z 2025-12-04T11:58:24.4418437Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4418636Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4419008Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-22d3b0b8730091a0.xml - 2025-12-04T11:58:24.4419350Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4419708Z FAILED [6.3109s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4420045Z Traceback (most recent call last): 2025-12-04T11:58:24.4420287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4420528Z getattr(self, test_name)() 2025-12-04T11:58:24.4420759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4420988Z fn() 2025-12-04T11:58:24.4421186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4421413Z method(*args, **kwargs) 2025-12-04T11:58:24.4421627Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4421899Z method(*args, **kwargs) 2025-12-04T11:58:24.4422115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4422336Z with policy(): 2025-12-04T11:58:24.4422546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4422777Z raise RuntimeError(msg) 2025-12-04T11:58:24.4423203Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2243952640 and is now 3005218816. 2025-12-04T11:58:24.4423595Z 2025-12-04T11:58:24.4423672Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4424029Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4424310Z 2025-12-04T11:58:24.4424397Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4424585Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4424787Z ======================= 1 failed, 7 deselected in 6.32s ======================== 2025-12-04T11:58:24.4424923Z Got exit code 1 2025-12-04T11:58:24.4425018Z Retrying single test... 
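The leak report above ("Caching allocator allocated memory was 512 and is now reported as 2560 ...") comes from the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 policy, which snapshots per-device memory before the test and re-checks it on exit; that is the "with policy():" frame visible in every traceback. Below is a rough, hypothetical sketch of that before/after comparison using torch.cuda.memory_allocated() for the caching-allocator number and torch.cuda.mem_get_info() for the driver-level number. It only illustrates the idea and assumes a CUDA/ROCm device is present; it is not the CudaMemoryLeakCheck code in common_utils.py, whose thresholds and retry logic are more involved.

import torch

class MemLeakCheck:
    """Flags memory still allocated on any device after a test (illustrative only)."""

    def __enter__(self):
        torch.cuda.synchronize()
        n = torch.cuda.device_count()
        self.allocator_before = [torch.cuda.memory_allocated(d) for d in range(n)]
        # mem_get_info returns (free_bytes, total_bytes) for a device.
        self.driver_used_before = [
            total - free
            for free, total in (torch.cuda.mem_get_info(d) for d in range(n))
        ]
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # do not mask the test's own failure
        torch.cuda.synchronize()
        for d in range(torch.cuda.device_count()):
            after = torch.cuda.memory_allocated(d)
            free, total = torch.cuda.mem_get_info(d)
            if after > self.allocator_before[d] and (total - free) > self.driver_used_before[d]:
                raise RuntimeError(
                    f"possible leak on device {d}: allocator "
                    f"{self.allocator_before[d]} -> {after} bytes"
                )
        return False

Wrapping the test body as "with MemLeakCheck(): run_test()" mirrors how the policy context manager surrounds the test method in the tracebacks above.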
2025-12-04T11:58:24.4425295Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-caab704f2611f4a9.xml 2025-12-04T11:58:24.4425597Z ============================= test session starts ============================== 2025-12-04T11:58:24.4425809Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4425998Z cachedir: .pytest_cache 2025-12-04T11:58:24.4426225Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4426464Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4426583Z configfile: pytest.ini 2025-12-04T11:58:24.4426812Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4427079Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4427421Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4427731Z Running 1 items in this shard 2025-12-04T11:58:24.4427805Z 2025-12-04T11:58:24.4428131Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:54:22.070000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 352681 2025-12-04T11:58:24.4428684Z I1204 11:54:22.070000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 352682 2025-12-04T11:58:24.4429027Z I1204 11:54:22.071000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 352683 2025-12-04T11:58:24.4429365Z I1204 11:54:22.072000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 352684 2025-12-04T11:58:24.4430052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4430680Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4431263Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4431844Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4432424Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4433002Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4433585Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4434208Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4434448Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4434793Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4435287Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4435768Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4436249Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4436700Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4437139Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4437602Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4438065Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4438567Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4439029Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4439478Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4439971Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4440434Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4441105Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4441738Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4442087Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4442695Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4443253Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4443617Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4444031Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4444273Z dist init r=1, world=4 2025-12-04T11:58:24.4444475Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4444812Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4445296Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4445776Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4446253Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4446698Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4447139Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4447601Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4448070Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4448575Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4449035Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4449524Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4449977Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4450442Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4451110Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4451740Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4452087Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4452736Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4453256Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4453619Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4454032Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4454273Z dist init r=3, world=4 2025-12-04T11:58:24.4454475Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4454813Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4455299Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4455782Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4456261Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4456707Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4457146Z [rank2]:E1204 11:54:27.332000 352683 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4457607Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4458070Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4458611Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4459071Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4459522Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4459974Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4460438Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4461204Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4461873Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4462220Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4462828Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4463349Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4463713Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4464126Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4464366Z dist init r=2, world=4 2025-12-04T11:58:24.4464566Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4464901Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4465386Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4465865Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4466345Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4466793Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4467231Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4467691Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4468422Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4468883Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4469345Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4469798Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4470252Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4470714Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4471380Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4472040Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4472394Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4473003Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4473525Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4473888Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4474301Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4474541Z dist init r=0, world=4 2025-12-04T11:58:24.4474644Z FAILED [6.2112s] [100%] 2025-12-04T11:58:24.4474709Z 2025-12-04T11:58:24.4474766Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4474967Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:58:24.4475154Z Traceback (most recent call last): 2025-12-04T11:58:24.4475399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4475645Z self._join_processes(fn) 2025-12-04T11:58:24.4475892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4476158Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4476426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4476685Z raise RuntimeError(error) 2025-12-04T11:58:24.4476840Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4477004Z Traceback (most recent call last): 2025-12-04T11:58:24.4477271Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4477514Z getattr(self, test_name)() 2025-12-04T11:58:24.4477748Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4477981Z fn() 2025-12-04T11:58:24.4478238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4478469Z method(*args, **kwargs) 2025-12-04T11:58:24.4478691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4478921Z method(*args, **kwargs) 2025-12-04T11:58:24.4479138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4479363Z with policy(): 2025-12-04T11:58:24.4479578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4479810Z raise RuntimeError(msg) 2025-12-04T11:58:24.4480252Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4480682Z 2025-12-04T11:58:24.4480758Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4481115Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4481397Z 2025-12-04T11:58:24.4481488Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4481612Z 2025-12-04T11:58:24.4481616Z 2025-12-04T11:58:24.4481695Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4481897Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4482273Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-caab704f2611f4a9.xml - 2025-12-04T11:58:24.4482620Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4482981Z FAILED [6.2112s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4483321Z Traceback (most recent call last): 2025-12-04T11:58:24.4483567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4483812Z getattr(self, test_name)() 2025-12-04T11:58:24.4484045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4484278Z fn() 2025-12-04T11:58:24.4484482Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4484712Z method(*args, **kwargs) 2025-12-04T11:58:24.4484930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4485159Z method(*args, **kwargs) 2025-12-04T11:58:24.4485375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4485599Z with policy(): 2025-12-04T11:58:24.4485811Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4486082Z raise RuntimeError(msg) 2025-12-04T11:58:24.4486508Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4486900Z 2025-12-04T11:58:24.4486977Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4487334Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4487614Z 2025-12-04T11:58:24.4487702Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4487891Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T11:58:24.4488058Z ======================= 1 failed, 7 deselected in 6.22s ======================== 2025-12-04T11:58:24.4488244Z Got exit code 1 2025-12-04T11:58:24.4488497Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4488897Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4489268Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-32a0f1d064cd3c3f.xml 2025-12-04T11:58:24.4489567Z ============================= test session starts ============================== 2025-12-04T11:58:24.4489777Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4489968Z cachedir: .pytest_cache 2025-12-04T11:58:24.4490195Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4490435Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4490555Z configfile: pytest.ini 2025-12-04T11:58:24.4490783Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4491056Z collecting ... collected 8 items / 1 deselected / 7 selected 2025-12-04T11:58:24.4491217Z stepcurrent: skipping 1 already run items. 2025-12-04T11:58:24.4491349Z Running 7 items in this shard 2025-12-04T11:58:24.4491424Z 2025-12-04T11:58:24.4491749Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:54:30.734000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 353059 2025-12-04T11:58:24.4492259Z I1204 11:54:30.735000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 353060 2025-12-04T11:58:24.4492605Z I1204 11:54:30.736000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 353061 2025-12-04T11:58:24.4492944Z I1204 11:54:30.736000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 353062 2025-12-04T11:58:24.4493636Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4494227Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4494851Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4495435Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4496017Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4496598Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4497176Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4497756Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4498021Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4498415Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4498906Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4499387Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4499870Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4500317Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4500755Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4501306Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4502045Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4502813Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4503540Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4504253Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4504962Z [rank0]:E1204 11:54:35.926000 353059 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4505676Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4506762Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4507766Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4508380Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4509345Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4509950Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4510477Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4511187Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4511576Z dist init r=0, world=4 2025-12-04T11:58:24.4511885Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4512426Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4513023Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4513596Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4514077Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4514588Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4515029Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4515494Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4515957Z [rank2]:E1204 
11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4516419Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4516878Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4517329Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4517828Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4518344Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4519016Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4519647Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4519998Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4520604Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4521163Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4521528Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4521942Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4522185Z dist init r=2, world=4 2025-12-04T11:58:24.4522389Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4522729Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4523215Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4523696Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T11:58:24.4524173Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4524621Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4525058Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4525523Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4525984Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4526443Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4526957Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4527407Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4527863Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4528390Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4529063Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:58:24.4529693Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4530040Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4530680Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4531202Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4531569Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4531983Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4532225Z dist init r=1, world=4 2025-12-04T11:58:24.4532426Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4532766Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4533256Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4533734Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4534214Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4534662Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4535104Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4535567Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4536028Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4536522Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4536986Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4537441Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4537897Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4538401Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4539074Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4539742Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4540089Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4540694Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4541212Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4541576Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4541990Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4542231Z dist init r=3, world=4 2025-12-04T11:58:24.4542333Z FAILED [6.2120s] [ 14%] 2025-12-04T11:58:24.4542399Z 2025-12-04T11:58:24.4542458Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4542657Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:58:24.4542844Z Traceback (most recent call last): 2025-12-04T11:58:24.4543093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4543339Z self._join_processes(fn) 2025-12-04T11:58:24.4543586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4543855Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4544125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4544385Z raise RuntimeError(error) 2025-12-04T11:58:24.4544541Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4544705Z Traceback (most recent call last): 2025-12-04T11:58:24.4544945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4545186Z getattr(self, test_name)() 2025-12-04T11:58:24.4545454Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4545687Z fn() 2025-12-04T11:58:24.4545890Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4546124Z method(*args, **kwargs) 2025-12-04T11:58:24.4546344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4546573Z method(*args, **kwargs) 2025-12-04T11:58:24.4546791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4547017Z with policy(): 2025-12-04T11:58:24.4547231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4547463Z raise RuntimeError(msg) 2025-12-04T11:58:24.4547897Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4548353Z 2025-12-04T11:58:24.4548431Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4548788Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4549067Z 2025-12-04T11:58:24.4549159Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4549283Z 2025-12-04T11:58:24.4549347Z Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4549490Z Traceback (most recent call last): 2025-12-04T11:58:24.4549734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4549979Z getattr(self, test_name)() 2025-12-04T11:58:24.4550211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4550445Z fn() 2025-12-04T11:58:24.4550647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4550876Z method(*args, **kwargs) 2025-12-04T11:58:24.4551094Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4551322Z method(*args, **kwargs) 2025-12-04T11:58:24.4551538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4551762Z with policy(): 2025-12-04T11:58:24.4551975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4552205Z raise RuntimeError(msg) 2025-12-04T11:58:24.4552628Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:58:24.4553018Z 2025-12-04T11:58:24.4553095Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4553451Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4553730Z 2025-12-04T11:58:24.4553821Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4553944Z 2025-12-04T11:58:24.4554044Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4554187Z Traceback (most recent call last): 2025-12-04T11:58:24.4554429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4554671Z getattr(self, test_name)() 2025-12-04T11:58:24.4554906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4555138Z fn() 2025-12-04T11:58:24.4555339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4555568Z method(*args, **kwargs) 2025-12-04T11:58:24.4555787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4556015Z method(*args, **kwargs) 2025-12-04T11:58:24.4556236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4556462Z with policy(): 2025-12-04T11:58:24.4556674Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4556952Z raise RuntimeError(msg) 2025-12-04T11:58:24.4557378Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4557769Z 2025-12-04T11:58:24.4557844Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4558235Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4558518Z 2025-12-04T11:58:24.4558609Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4558735Z 2025-12-04T11:58:24.4558737Z 2025-12-04T11:58:24.4558815Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4559021Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4559397Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-32a0f1d064cd3c3f.xml - 2025-12-04T11:58:24.4559742Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4560104Z FAILED [6.2120s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4560444Z Traceback (most recent call last): 2025-12-04T11:58:24.4560691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4560934Z getattr(self, test_name)() 2025-12-04T11:58:24.4561167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4561403Z fn() 2025-12-04T11:58:24.4561604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4561834Z method(*args, **kwargs) 2025-12-04T11:58:24.4562052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4562281Z method(*args, **kwargs) 2025-12-04T11:58:24.4562499Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4562727Z with policy(): 2025-12-04T11:58:24.4562980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4563214Z raise RuntimeError(msg) 2025-12-04T11:58:24.4563647Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 
2025-12-04T11:58:24.4570906Z 2025-12-04T11:58:24.4570998Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4571371Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4571660Z 2025-12-04T11:58:24.4571754Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4571887Z 2025-12-04T11:58:24.4571954Z Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4572106Z Traceback (most recent call last): 2025-12-04T11:58:24.4572362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4572674Z getattr(self, test_name)() 2025-12-04T11:58:24.4572910Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4573144Z fn() 2025-12-04T11:58:24.4573349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4573582Z method(*args, **kwargs) 2025-12-04T11:58:24.4573803Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4574033Z method(*args, **kwargs) 2025-12-04T11:58:24.4574255Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4574482Z with policy(): 2025-12-04T11:58:24.4574694Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4574930Z raise RuntimeError(msg) 2025-12-04T11:58:24.4575363Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:58:24.4575763Z 2025-12-04T11:58:24.4575838Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4576196Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4576482Z 2025-12-04T11:58:24.4576571Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4576697Z 2025-12-04T11:58:24.4576762Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4576905Z Traceback (most recent call last): 2025-12-04T11:58:24.4577154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4577399Z getattr(self, test_name)() 2025-12-04T11:58:24.4577633Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4577868Z fn() 2025-12-04T11:58:24.4578069Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4578393Z method(*args, **kwargs) 2025-12-04T11:58:24.4578647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4578880Z method(*args, **kwargs) 2025-12-04T11:58:24.4579095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4579328Z with policy(): 2025-12-04T11:58:24.4579538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4579770Z raise RuntimeError(msg) 2025-12-04T11:58:24.4580198Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4580591Z 2025-12-04T11:58:24.4580669Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4581030Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4581309Z 2025-12-04T11:58:24.4581400Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4581627Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4581797Z ======================= 1 failed, 1 deselected in 6.22s ======================== 2025-12-04T11:58:24.4581941Z Got exit code 1 2025-12-04T11:58:24.4582044Z Retrying single test... 
2025-12-04T11:58:24.4582319Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-064375b06a4c88cb.xml 2025-12-04T11:58:24.4582619Z ============================= test session starts ============================== 2025-12-04T11:58:24.4582837Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4583027Z cachedir: .pytest_cache 2025-12-04T11:58:24.4583254Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4583497Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4583620Z configfile: pytest.ini 2025-12-04T11:58:24.4583855Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4584128Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4584476Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4584793Z Running 1 items in this shard 2025-12-04T11:58:24.4584866Z 2025-12-04T11:58:24.4585194Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:54:39.457000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 353437 2025-12-04T11:58:24.4585712Z I1204 11:54:39.458000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 353438 2025-12-04T11:58:24.4586061Z I1204 11:54:39.459000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 353439 2025-12-04T11:58:24.4586404Z I1204 11:54:39.459000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 353440 2025-12-04T11:58:24.4587118Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4587708Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4588325Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4588913Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4589500Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4590080Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4590662Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4591285Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4591529Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4591875Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4592369Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4592857Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4593341Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4593792Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4594234Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4594700Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4595167Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4595634Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4596101Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4596554Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4597047Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4597511Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4598231Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4598869Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4599225Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4599834Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4600403Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4600775Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4601191Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4601437Z dist init r=0, world=4 2025-12-04T11:58:24.4601644Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4601986Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4602473Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4602955Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4603434Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4603885Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4604326Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4604795Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4605258Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4605720Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4606221Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4606672Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4607130Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4607599Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4608317Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4608952Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4609303Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4609953Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4610475Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4610844Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4611260Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4611503Z dist init r=3, world=4 2025-12-04T11:58:24.4611709Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4612052Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4612537Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4613019Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4613502Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4613949Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4614388Z [rank2]:E1204 11:54:44.822000 353439 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4614851Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4615352Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4615817Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4616280Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4616736Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4617189Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4617656Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4618384Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4619053Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4619404Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4620012Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4620533Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4620900Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4621328Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4621572Z dist init r=2, world=4 2025-12-04T11:58:24.4621775Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4622113Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4622602Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4623080Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4623560Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4624007Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4624448Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4624949Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4625415Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4625875Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4626338Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4626786Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4627243Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4627712Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4628448Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4629077Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4629427Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4630033Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4630554Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4630920Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4631330Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4631568Z dist init r=1, world=4 2025-12-04T11:58:24.4631668Z FAILED [6.3131s] [100%] 2025-12-04T11:58:24.4631731Z 2025-12-04T11:58:24.4631790Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4631987Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:58:24.4632169Z Traceback (most recent call last): 2025-12-04T11:58:24.4632412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4632652Z self._join_processes(fn) 2025-12-04T11:58:24.4632895Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4633156Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4633421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4633680Z raise RuntimeError(error) 2025-12-04T11:58:24.4633878Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4634038Z Traceback (most recent call last): 2025-12-04T11:58:24.4634276Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4634517Z getattr(self, test_name)() 2025-12-04T11:58:24.4634750Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4634979Z fn() 2025-12-04T11:58:24.4635178Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4635406Z method(*args, **kwargs) 2025-12-04T11:58:24.4635624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4635849Z method(*args, **kwargs) 2025-12-04T11:58:24.4636067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4636291Z with policy(): 2025-12-04T11:58:24.4636502Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4636772Z raise RuntimeError(msg) 2025-12-04T11:58:24.4637196Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4637586Z 2025-12-04T11:58:24.4637660Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4638014Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4638337Z 2025-12-04T11:58:24.4638429Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4638553Z 2025-12-04T11:58:24.4638554Z 2025-12-04T11:58:24.4638633Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4638835Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4639208Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-064375b06a4c88cb.xml - 2025-12-04T11:58:24.4639551Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4639908Z FAILED [6.3131s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4640245Z Traceback (most recent call last): 2025-12-04T11:58:24.4640489Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4640729Z getattr(self, test_name)() 2025-12-04T11:58:24.4640959Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4641189Z fn() 2025-12-04T11:58:24.4641389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4641615Z method(*args, **kwargs) 2025-12-04T11:58:24.4641830Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4642055Z method(*args, **kwargs) 2025-12-04T11:58:24.4642269Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4642490Z with policy(): 2025-12-04T11:58:24.4642734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4642962Z raise RuntimeError(msg) 2025-12-04T11:58:24.4643386Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4643781Z 2025-12-04T11:58:24.4643856Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4644209Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4644487Z 2025-12-04T11:58:24.4644575Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4644763Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4644924Z ======================= 1 failed, 7 deselected in 6.32s ======================== 2025-12-04T11:58:24.4645059Z Got exit code 1 2025-12-04T11:58:24.4645154Z Retrying single test... 
2025-12-04T11:58:24.4645461Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-86744d037db0ba9d.xml 2025-12-04T11:58:24.4645756Z ============================= test session starts ============================== 2025-12-04T11:58:24.4645965Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4646150Z cachedir: .pytest_cache 2025-12-04T11:58:24.4646373Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4646608Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4646725Z configfile: pytest.ini 2025-12-04T11:58:24.4646953Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4647218Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4647560Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4647873Z Running 1 items in this shard 2025-12-04T11:58:24.4647945Z 2025-12-04T11:58:24.4648311Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:54:48.351000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 353815 2025-12-04T11:58:24.4648822Z I1204 11:54:48.352000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 353816 2025-12-04T11:58:24.4649164Z I1204 11:54:48.352000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 353817 2025-12-04T11:58:24.4649501Z I1204 11:54:48.353000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 353818 2025-12-04T11:58:24.4650188Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4650773Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4651397Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4651976Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4652554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4653127Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4653702Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4654274Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4654545Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4654884Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4655371Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4655848Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4656330Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4656775Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4657213Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4657673Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4658134Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4658629Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4659090Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4659537Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4659988Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4660447Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4661160Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4661794Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4662141Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4662744Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4663262Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4663624Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4664078Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4664318Z dist init r=0, world=4 2025-12-04T11:58:24.4664517Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4664851Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4665334Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4665810Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4666288Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4666734Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4667169Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4667630Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4668088Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4668588Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4669045Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4669492Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4669983Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4670443Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4671115Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4671741Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4672088Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4672690Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4673241Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4673600Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4674010Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4674247Z dist init r=2, world=4 2025-12-04T11:58:24.4674448Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4674782Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4675263Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4675742Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4676220Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4676666Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4677104Z [rank3]:E1204 11:54:53.747000 353818 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4677565Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4678023Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4678525Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4679020Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4679467Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4679917Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4680380Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4681052Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 
2025-12-04T11:58:24.4681678Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4682023Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4682661Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4683178Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4683538Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4683949Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4684187Z dist init r=3, world=4 2025-12-04T11:58:24.4684385Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4684720Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4685201Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4685679Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4686155Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4686600Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4687039Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4687497Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4687956Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4688500Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4688960Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4689410Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4689860Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4690320Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4690989Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4691648Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4691994Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4692594Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4693111Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4693471Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4693882Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4694119Z dist init r=1, world=4 2025-12-04T11:58:24.4694221Z FAILED [6.4126s] [100%] 2025-12-04T11:58:24.4694285Z 2025-12-04T11:58:24.4694342Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4694539Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:58:24.4694722Z Traceback (most recent call last): 2025-12-04T11:58:24.4694967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4695208Z self._join_processes(fn) 2025-12-04T11:58:24.4695453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4695721Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4695987Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4696244Z raise RuntimeError(error) 2025-12-04T11:58:24.4696397Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4696558Z Traceback (most recent call last): 2025-12-04T11:58:24.4696797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4697036Z getattr(self, test_name)() 2025-12-04T11:58:24.4697289Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4697327Z fn() 2025-12-04T11:58:24.4697479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4697525Z method(*args, **kwargs) 2025-12-04T11:58:24.4697676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4697718Z method(*args, **kwargs) 2025-12-04T11:58:24.4697868Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4697907Z with policy(): 2025-12-04T11:58:24.4698059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4698102Z raise RuntimeError(msg) 2025-12-04T11:58:24.4698508Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4698548Z 2025-12-04T11:58:24.4698624Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4698870Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4698872Z 2025-12-04T11:58:24.4698960Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4698963Z 2025-12-04T11:58:24.4698964Z 2025-12-04T11:58:24.4699042Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4699130Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4699384Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-86744d037db0ba9d.xml - 2025-12-04T11:58:24.4699447Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4699708Z FAILED [6.4126s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4699756Z Traceback (most recent call last): 2025-12-04T11:58:24.4699921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4699965Z getattr(self, test_name)() 2025-12-04T11:58:24.4700124Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4700160Z fn() 2025-12-04T11:58:24.4700311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4700353Z method(*args, **kwargs) 2025-12-04T11:58:24.4700503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4700546Z method(*args, **kwargs) 2025-12-04T11:58:24.4700695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4700733Z with policy(): 2025-12-04T11:58:24.4700883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4700925Z raise RuntimeError(msg) 2025-12-04T11:58:24.4701323Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4701326Z 2025-12-04T11:58:24.4701402Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4701647Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4701649Z 2025-12-04T11:58:24.4701736Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4701800Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
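The "Process 2 exited with error code 10" wrapping comes from the multiprocess test harness: each rank runs the test in its own process, _join_processes waits for them, and _check_return_codes turns any non-zero exit code into the RuntimeError shown in the summary. A rough sketch of that pattern using torch.multiprocessing (the actual common_distributed harness manages its processes itself; this is only illustrative):

    import torch.multiprocessing as mp

    def _worker(rank: int, world_size: int) -> None:
        # Per-rank test body; in the log above each rank exits with code 10
        # once its own leak check raises.
        ...

    def run_multiprocess_test(world_size: int = 4) -> None:
        # spawn() joins the workers and raises ProcessExitedException if any
        # rank exits non-zero, roughly what _check_return_codes surfaces as
        # "Process N exited with error code 10".
        mp.spawn(_worker, args=(world_size,), nprocs=world_size)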
2025-12-04T11:58:24.4701862Z ======================= 1 failed, 7 deselected in 6.42s ======================== 2025-12-04T11:58:24.4701901Z Got exit code 1 2025-12-04T11:58:24.4702096Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4702225Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4702433Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4c027be4a8a991b6.xml 2025-12-04T11:58:24.4702517Z ============================= test session starts ============================== 2025-12-04T11:58:24.4702631Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4702674Z cachedir: .pytest_cache 2025-12-04T11:58:24.4702833Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4702880Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4702922Z configfile: pytest.ini 2025-12-04T11:58:24.4703088Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4703160Z collecting ... collected 8 items / 2 deselected / 6 selected 2025-12-04T11:58:24.4703217Z stepcurrent: skipping 2 already run items. 2025-12-04T11:58:24.4703261Z Running 6 items in this shard 2025-12-04T11:58:24.4703266Z 2025-12-04T11:58:24.4703625Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:54:57.439000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 354193 2025-12-04T11:58:24.4703781Z I1204 11:54:57.440000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 354194 2025-12-04T11:58:24.4703935Z I1204 11:54:57.441000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 354195 2025-12-04T11:58:24.4704087Z I1204 11:54:57.441000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 354196 2025-12-04T11:58:24.4704588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4704654Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4705144Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4705227Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4705717Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4705777Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4706264Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4706323Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4706468Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4706632Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4706946Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4707103Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4707392Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4707520Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4707797Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4707947Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4708254Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4708402Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4708679Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4708815Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4709093Z [rank0]:E1204 11:55:04.487000 354193 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4709242Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4709796Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4709913Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4710111Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4710527Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4710643Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4710858Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4711023Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4711102Z dist init r=0, world=4 2025-12-04T11:58:24.4711241Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4711401Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4711687Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4711843Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4712127Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4712253Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4712529Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4712677Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4712955Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4713103Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4713379Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4713516Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4713793Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4713967Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4714482Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4714599Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4714796Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4715207Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4715323Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4715560Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4715725Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4715864Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4716025Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4716312Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4716467Z [rank3]:E1204 11:55:04.499000 354196 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4716751Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4716874Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4717153Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4717302Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4717581Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4717727Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4718000Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4718225Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4718503Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4718654Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4719167Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
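The UserWarning repeated near the start of this test run ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") is unrelated to the leak and can be silenced with either fix the message itself suggests: call torch.cuda.set_device() before wrapping, or pass a device with an explicit index. A minimal per-rank sketch (illustrative only, not the code under test):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model: torch.nn.Module, rank: int) -> FSDP:
        # Bind this process to its GPU first, as the warning recommends.
        torch.cuda.set_device(rank)
        # Passing an indexed device (cuda:<rank>) instead of the bare "cuda"
        # string avoids the "does not have an explicit index" warning.
        return FSDP(model, device_id=torch.device("cuda", rank))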
2025-12-04T11:58:24.4719284Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4719480Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4719888Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4720037Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4720247Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4720415Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4720454Z dist init r=2, world=4 2025-12-04T11:58:24.4720494Z dist init r=3, world=4 2025-12-04T11:58:24.4720632Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4720793Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4721079Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4721232Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4721520Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4721644Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4721925Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4722073Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4722350Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4722533Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4722807Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4722945Z [rank1]:E1204 11:55:04.510000 354194 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4723222Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4723371Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4723883Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4724020Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4724218Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4724624Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4724741Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4724951Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4725117Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4725157Z dist init r=1, world=4 2025-12-04T11:58:24.4725509Z [rank0]:[W1204 11:55:04.319681436 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4725551Z FAILED [8.8171s] [ 16%] 2025-12-04T11:58:24.4725553Z 2025-12-04T11:58:24.4725612Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4725750Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4725797Z Traceback (most recent call last): 2025-12-04T11:58:24.4725962Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4726008Z self._join_processes(fn) 2025-12-04T11:58:24.4726182Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4726235Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4726417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4726460Z raise RuntimeError(error) 2025-12-04T11:58:24.4726544Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4726611Z Traceback (most recent call last): 2025-12-04T11:58:24.4726774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4726817Z getattr(self, test_name)() 2025-12-04T11:58:24.4726979Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4727014Z fn() 2025-12-04T11:58:24.4727167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4727207Z method(*args, **kwargs) 2025-12-04T11:58:24.4727359Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4727400Z method(*args, **kwargs) 2025-12-04T11:58:24.4727550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4727589Z with policy(): 2025-12-04T11:58:24.4727742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4727785Z raise RuntimeError(msg) 2025-12-04T11:58:24.4728253Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
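The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources") concerns teardown rather than the allocator leak itself; the pattern it asks for is an explicit destroy on the way out. A sketch of the usual shape (assumes the rendezvous variables such as MASTER_ADDR and MASTER_PORT are already set in the environment; this is not the harness code):

    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # per-rank test or training body
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning at interpreter shutdown.
            dist.destroy_process_group()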
2025-12-04T11:58:24.4728256Z 2025-12-04T11:58:24.4728333Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4728614Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4728616Z 2025-12-04T11:58:24.4728707Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4728710Z 2025-12-04T11:58:24.4728711Z 2025-12-04T11:58:24.4728787Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4728878Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4729130Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4c027be4a8a991b6.xml - 2025-12-04T11:58:24.4729191Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4729484Z FAILED [8.8171s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4729530Z Traceback (most recent call last): 2025-12-04T11:58:24.4729697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4729739Z getattr(self, test_name)() 2025-12-04T11:58:24.4729900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4729936Z fn() 2025-12-04T11:58:24.4730089Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4730129Z method(*args, **kwargs) 2025-12-04T11:58:24.4730280Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4730319Z method(*args, **kwargs) 2025-12-04T11:58:24.4730468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4730505Z with policy(): 2025-12-04T11:58:24.4730690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4730732Z raise RuntimeError(msg) 2025-12-04T11:58:24.4731125Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4731129Z 2025-12-04T11:58:24.4731205Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4731485Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4731487Z 2025-12-04T11:58:24.4731577Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4731639Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4731702Z ======================= 1 failed, 2 deselected in 8.83s ======================== 2025-12-04T11:58:24.4731777Z Got exit code 1 2025-12-04T11:58:24.4731819Z Retrying single test... 2025-12-04T11:58:24.4732025Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ee549baed1036602.xml 2025-12-04T11:58:24.4732084Z ============================= test session starts ============================== 2025-12-04T11:58:24.4732196Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4732240Z cachedir: .pytest_cache 2025-12-04T11:58:24.4732400Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4732449Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4732491Z configfile: pytest.ini 2025-12-04T11:58:24.4732656Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4732730Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4733005Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4733050Z Running 1 items in this shard 2025-12-04T11:58:24.4733052Z 2025-12-04T11:58:24.4733407Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:55:08.799000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 354595 2025-12-04T11:58:24.4733564Z I1204 11:55:08.800000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 354596 2025-12-04T11:58:24.4733716Z I1204 11:55:08.801000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 354597 2025-12-04T11:58:24.4733869Z I1204 11:55:08.801000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 354598 2025-12-04T11:58:24.4734366Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4734431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4734974Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4735035Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4735521Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4735580Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4736068Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4736148Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4736291Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4736455Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4736744Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4736902Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4737189Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4737316Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4737593Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4737741Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4738021Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4738205Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4738482Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4738618Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4738897Z [rank3]:E1204 11:55:15.952000 354598 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4739082Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4739599Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4739717Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4739912Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4740322Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4740436Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4740688Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4740854Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4740893Z dist init r=3, world=4 2025-12-04T11:58:24.4741033Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4741193Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4741479Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4741635Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4741920Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4742045Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4742323Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4742472Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4742750Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4742898Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4743172Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4743329Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4743607Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4743758Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4744270Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4744386Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4744583Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4745010Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4745125Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4745336Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4745503Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4745544Z dist init r=2, world=4 2025-12-04T11:58:24.4745680Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4745842Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4746126Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4746280Z [rank1]:E1204 11:55:15.957000 354596 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4746564Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4746689Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4746965Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4747113Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4747390Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4747567Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4747843Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4747980Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4748307Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4748455Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4748969Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:58:24.4749238Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4749434Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4749844Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4749958Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4750171Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4750338Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4750377Z dist init r=1, world=4 2025-12-04T11:58:24.4750515Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4750675Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4750964Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4751117Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4751404Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4751527Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4751805Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4751988Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4752265Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4752416Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4752691Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4752828Z [rank0]:E1204 11:55:16.005000 354595 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4753106Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4753256Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4753789Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4753903Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4754102Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4754516Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4754633Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4754844Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4755010Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4755051Z dist init r=0, world=4 2025-12-04T11:58:24.4755391Z [rank0]:[W1204 11:55:16.955971913 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4755432Z FAILED [9.0134s] [100%] 2025-12-04T11:58:24.4755434Z 2025-12-04T11:58:24.4755493Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4755630Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4755678Z Traceback (most recent call last): 2025-12-04T11:58:24.4755843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4755888Z self._join_processes(fn) 2025-12-04T11:58:24.4756062Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4756139Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4756320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4756364Z raise RuntimeError(error) 2025-12-04T11:58:24.4756449Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4756495Z Traceback (most recent call last): 2025-12-04T11:58:24.4756659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4756701Z getattr(self, test_name)() 2025-12-04T11:58:24.4756861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4756896Z fn() 2025-12-04T11:58:24.4757048Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4757090Z method(*args, **kwargs) 2025-12-04T11:58:24.4757243Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4757285Z method(*args, **kwargs) 2025-12-04T11:58:24.4757435Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4757504Z with policy(): 2025-12-04T11:58:24.4757656Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4757699Z raise RuntimeError(msg) 2025-12-04T11:58:24.4758090Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 
2025-12-04T11:58:24.4758093Z 2025-12-04T11:58:24.4758210Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4758491Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4758494Z 2025-12-04T11:58:24.4758585Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4758587Z 2025-12-04T11:58:24.4758588Z 2025-12-04T11:58:24.4758665Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4758753Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4759007Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ee549baed1036602.xml - 2025-12-04T11:58:24.4759070Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4759365Z FAILED [9.0134s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4759412Z Traceback (most recent call last): 2025-12-04T11:58:24.4759577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4759620Z getattr(self, test_name)() 2025-12-04T11:58:24.4759780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4759815Z fn() 2025-12-04T11:58:24.4759967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4760007Z method(*args, **kwargs) 2025-12-04T11:58:24.4760196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4760237Z method(*args, **kwargs) 2025-12-04T11:58:24.4760387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4760426Z with policy(): 2025-12-04T11:58:24.4760580Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4760622Z raise RuntimeError(msg) 2025-12-04T11:58:24.4761012Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4761014Z 2025-12-04T11:58:24.4761092Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4761375Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4761407Z 2025-12-04T11:58:24.4761497Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4761560Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4761625Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4761662Z Got exit code 1 2025-12-04T11:58:24.4761704Z Retrying single test... 2025-12-04T11:58:24.4761910Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-09d71ce4d97b7d04.xml 2025-12-04T11:58:24.4761969Z ============================= test session starts ============================== 2025-12-04T11:58:24.4762083Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4762126Z cachedir: .pytest_cache 2025-12-04T11:58:24.4762284Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4762333Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4762376Z configfile: pytest.ini 2025-12-04T11:58:24.4762538Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4762612Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4762884Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4762929Z Running 1 items in this shard 2025-12-04T11:58:24.4762931Z 2025-12-04T11:58:24.4763284Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:55:20.359000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 354997 2025-12-04T11:58:24.4763442Z I1204 11:55:20.360000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 354998 2025-12-04T11:58:24.4763595Z I1204 11:55:20.361000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 354999 2025-12-04T11:58:24.4763750Z I1204 11:55:20.361000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 355000 2025-12-04T11:58:24.4764270Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4764332Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4764826Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4764887Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4765374Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4765431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4765938Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4765997Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4766141Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4766307Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4766597Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4766756Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4767041Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4767168Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4767448Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4767596Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4767876Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4768023Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4768348Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4768521Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4768800Z [rank0]:E1204 11:55:27.406000 354997 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4768952Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4769469Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4769587Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4769783Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4770225Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4770341Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4770555Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4770722Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4770762Z dist init r=0, world=4 2025-12-04T11:58:24.4770901Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4771061Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4771349Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4771502Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4771791Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4771917Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4772194Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4772345Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4772620Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4772790Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4773066Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4773206Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4773483Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4773632Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4774147Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4774284Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4774480Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4774885Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4775003Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4775217Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4775383Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4775424Z dist init r=3, world=4 2025-12-04T11:58:24.4775562Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4775722Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4776011Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4776167Z [rank2]:E1204 11:55:27.421000 354999 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4776450Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4776576Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4776852Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4777017Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4777297Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4777447Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4777724Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4777860Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4778140Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4778342Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4778858Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4779011Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4779206Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4779614Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4779731Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4779943Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4780107Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4780146Z dist init r=2, world=4 2025-12-04T11:58:24.4780286Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4780445Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4780731Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4780887Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4781173Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4781296Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4781609Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4781760Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4782037Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4782186Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4782462Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4782600Z [rank1]:E1204 11:55:27.427000 354998 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4782876Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4783047Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4783561Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4783675Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4783872Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4784282Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4784399Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4784612Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4784776Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4784818Z dist init r=1, world=4 2025-12-04T11:58:24.4785158Z [rank0]:[W1204 11:55:27.239354586 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4785198Z FAILED [8.9149s] [100%] 2025-12-04T11:58:24.4785201Z 2025-12-04T11:58:24.4785258Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4785394Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4785442Z Traceback (most recent call last): 2025-12-04T11:58:24.4785627Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4785672Z self._join_processes(fn) 2025-12-04T11:58:24.4785846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4785902Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4786081Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4786127Z raise RuntimeError(error) 2025-12-04T11:58:24.4786209Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4786254Z Traceback (most recent call last): 2025-12-04T11:58:24.4786417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4786460Z getattr(self, test_name)() 2025-12-04T11:58:24.4786619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4786655Z fn() 2025-12-04T11:58:24.4786805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4786868Z method(*args, **kwargs) 2025-12-04T11:58:24.4787018Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4787060Z method(*args, **kwargs) 2025-12-04T11:58:24.4787209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4787247Z with policy(): 2025-12-04T11:58:24.4787397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4787440Z raise RuntimeError(msg) 2025-12-04T11:58:24.4787829Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4787833Z 2025-12-04T11:58:24.4787910Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4788235Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4788239Z 2025-12-04T11:58:24.4788327Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4788330Z 2025-12-04T11:58:24.4788331Z 2025-12-04T11:58:24.4788407Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4788495Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4788746Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-09d71ce4d97b7d04.xml - 2025-12-04T11:58:24.4788809Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4789103Z FAILED [8.9149s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4789149Z Traceback (most recent call last): 2025-12-04T11:58:24.4789314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4789356Z getattr(self, test_name)() 2025-12-04T11:58:24.4789551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4789586Z fn() 2025-12-04T11:58:24.4789738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4789780Z method(*args, **kwargs) 2025-12-04T11:58:24.4789931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4789972Z method(*args, **kwargs) 2025-12-04T11:58:24.4790121Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4790158Z with policy(): 2025-12-04T11:58:24.4790310Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4790351Z raise RuntimeError(msg) 2025-12-04T11:58:24.4790740Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4790771Z 2025-12-04T11:58:24.4790846Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4791125Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4791127Z 2025-12-04T11:58:24.4791216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4791280Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4791343Z ======================= 1 failed, 7 deselected in 8.92s ======================== 2025-12-04T11:58:24.4791382Z Got exit code 1 2025-12-04T11:58:24.4791609Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4791739Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4791949Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2ae3cfecb382b0b2.xml 2025-12-04T11:58:24.4792007Z ============================= test session starts ============================== 2025-12-04T11:58:24.4792119Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4792161Z cachedir: .pytest_cache 2025-12-04T11:58:24.4792318Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4792367Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4792409Z configfile: pytest.ini 2025-12-04T11:58:24.4792572Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4792645Z collecting ... collected 8 items / 3 deselected / 5 selected 2025-12-04T11:58:24.4792699Z stepcurrent: skipping 3 already run items. 2025-12-04T11:58:24.4792742Z Running 5 items in this shard 2025-12-04T11:58:24.4792744Z 2025-12-04T11:58:24.4793100Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:55:31.957000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 355399 2025-12-04T11:58:24.4793254Z I1204 11:55:31.958000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 355400 2025-12-04T11:58:24.4793431Z I1204 11:55:31.959000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 355401 2025-12-04T11:58:24.4793584Z I1204 11:55:31.959000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 355402 2025-12-04T11:58:24.4794082Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4794144Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4794633Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4794694Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4795178Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4795267Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4795754Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4795811Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4795956Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4796119Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4796410Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4796567Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4796852Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4796978Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4797255Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4797403Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4797677Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4797846Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4798124Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4798306Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4798584Z [rank2]:E1204 11:55:39.098000 355401 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4798731Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4799248Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4799396Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4799593Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4800001Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4800116Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4800330Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4800497Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4800538Z dist init r=2, world=4 2025-12-04T11:58:24.4800676Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4800837Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4801127Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4801283Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4801569Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4801693Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4801968Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4802148Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4802427Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4802575Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4802852Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4802989Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4803266Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4803414Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4803946Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4804063Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4804258Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4804664Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4804780Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4804992Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4805157Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4805197Z dist init r=3, world=4 2025-12-04T11:58:24.4805337Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4805496Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4805786Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4805939Z [rank0]:E1204 11:55:39.113000 355399 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4806225Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4806370Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4806645Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4806794Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4807069Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4807217Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4807493Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4807629Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4807926Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4808074Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4808628Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
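The UserWarning emitted during FSDP initialization earlier in this session ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") itself suggests the two remedies sketched below. This is a minimal illustration under assumed setup (placeholder module, default process group already initialized), not the test file's actual code.

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(rank: int) -> FSDP:
    # Assumes the default process group has already been initialized.
    # Option 1: make the current device explicit before FSDP initialization.
    torch.cuda.set_device(rank)
    # Option 2: pass an indexed device instead of the bare "cuda" string.
    device = torch.device("cuda", rank)
    model = nn.Linear(8, 8).to(device)  # placeholder module for illustration
    return FSDP(model, device_id=device)

Either option silences the warning because FSDP no longer has to guess which device index a bare "cuda" refers to on each rank.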
2025-12-04T11:58:24.4808741Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4808940Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4809346Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4809460Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4809675Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4809839Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4809884Z dist init r=0, world=4 2025-12-04T11:58:24.4810022Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4810184Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4810470Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4810668Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4810955Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4811081Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4811357Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4811506Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4811789Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4811937Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4812259Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4812398Z [rank1]:E1204 11:55:39.115000 355400 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4812680Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4812834Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4813346Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4813463Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4813659Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4814068Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4814188Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4814405Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4814577Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4814619Z dist init r=1, world=4 2025-12-04T11:58:24.4815285Z [rank0]:[W1204 11:55:39.995830833 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4815329Z FAILED [9.1148s] [ 20%] 2025-12-04T11:58:24.4815332Z 2025-12-04T11:58:24.4815395Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4815533Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.4815589Z Traceback (most recent call last): 2025-12-04T11:58:24.4815755Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4815806Z self._join_processes(fn) 2025-12-04T11:58:24.4819171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4819234Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4819426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4819472Z raise RuntimeError(error) 2025-12-04T11:58:24.4819555Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4819604Z Traceback (most recent call last): 2025-12-04T11:58:24.4819771Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4819867Z getattr(self, test_name)() 2025-12-04T11:58:24.4820027Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4820064Z fn() 2025-12-04T11:58:24.4820218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4820263Z method(*args, **kwargs) 2025-12-04T11:58:24.4820414Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4820459Z method(*args, **kwargs) 2025-12-04T11:58:24.4820610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4820647Z with policy(): 2025-12-04T11:58:24.4820805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4820846Z raise RuntimeError(msg) 2025-12-04T11:58:24.4821240Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4821243Z 2025-12-04T11:58:24.4821321Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4821610Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4821612Z 2025-12-04T11:58:24.4821703Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4821707Z 2025-12-04T11:58:24.4821769Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4821816Z Traceback (most recent call last): 2025-12-04T11:58:24.4821983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4822028Z getattr(self, test_name)() 2025-12-04T11:58:24.4822188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4822225Z fn() 2025-12-04T11:58:24.4822407Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4822448Z method(*args, **kwargs) 2025-12-04T11:58:24.4822597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4822640Z method(*args, **kwargs) 2025-12-04T11:58:24.4822792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4822832Z with policy(): 2025-12-04T11:58:24.4822984Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4823026Z raise RuntimeError(msg) 2025-12-04T11:58:24.4823417Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4823419Z 2025-12-04T11:58:24.4823496Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4823778Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4823803Z 2025-12-04T11:58:24.4823891Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4823893Z 2025-12-04T11:58:24.4823955Z Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4824001Z Traceback (most recent call last): 2025-12-04T11:58:24.4824167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4824210Z getattr(self, test_name)() 2025-12-04T11:58:24.4824375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4824410Z fn() 2025-12-04T11:58:24.4824563Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4824604Z method(*args, **kwargs) 2025-12-04T11:58:24.4824756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4824796Z method(*args, **kwargs) 2025-12-04T11:58:24.4824949Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4824987Z with policy(): 2025-12-04T11:58:24.4825140Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4825181Z raise RuntimeError(msg) 2025-12-04T11:58:24.4825572Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4825576Z 2025-12-04T11:58:24.4825652Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4825932Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4825934Z 2025-12-04T11:58:24.4826023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4826025Z 2025-12-04T11:58:24.4826027Z 2025-12-04T11:58:24.4826106Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4826196Z Process 0 terminated with exit code 10, terminating remaining processes. 
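The failure above is raised by PyTorch's CUDA memory-leak checker (enabled in this run via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1): it snapshots per-device memory before the test body and compares it afterwards, and if either the caching-allocator figure or the driver-level allocation has grown it raises the RuntimeError quoted in the traceback. A minimal conceptual sketch of that before/after comparison, using only public torch.cuda APIs rather than the internal CudaMemoryLeakCheck helper (the function name, the slack threshold, and the exact bookkeeping below are illustrative assumptions, not the test suite's implementation):

    import torch

    def check_for_leak(fn, device=0, driver_slack_bytes=0):
        """Run fn() and report whether GPU memory on `device` grew.

        Conceptual stand-in for the leak check the test harness applies;
        the real checker lives in torch/testing/_internal/common_utils.py.
        """
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)       # caching-allocator view
        free_before, total = torch.cuda.mem_get_info(device)     # driver view
        driver_before = total - free_before

        fn()                                                      # the test body under check

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()                                  # release cached blocks first
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if alloc_after > alloc_before or driver_after > driver_before + driver_slack_bytes:
            raise RuntimeError(
                f"possible leak on device {device}: allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )

In the log the allocator goes from 512 bytes to a few kilobytes and the driver-level allocation grows by roughly 1.2 GB on every rank, which is what trips the check.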
2025-12-04T11:58:24.4826473Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2ae3cfecb382b0b2.xml - 2025-12-04T11:58:24.4826537Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4826833Z FAILED [9.1148s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4826882Z Traceback (most recent call last): 2025-12-04T11:58:24.4827045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4827090Z getattr(self, test_name)() 2025-12-04T11:58:24.4827251Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4827288Z fn() 2025-12-04T11:58:24.4827439Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4827480Z method(*args, **kwargs) 2025-12-04T11:58:24.4827657Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4827698Z method(*args, **kwargs) 2025-12-04T11:58:24.4827848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4827885Z with policy(): 2025-12-04T11:58:24.4828036Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4828078Z raise RuntimeError(msg) 2025-12-04T11:58:24.4828505Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4828507Z 2025-12-04T11:58:24.4828582Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4828867Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4828869Z 2025-12-04T11:58:24.4828959Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4828961Z 2025-12-04T11:58:24.4829021Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4829071Z Traceback (most recent call last): 2025-12-04T11:58:24.4829236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4829280Z getattr(self, test_name)() 2025-12-04T11:58:24.4829439Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4829476Z fn() 2025-12-04T11:58:24.4829628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4829669Z method(*args, **kwargs) 2025-12-04T11:58:24.4829819Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4829862Z method(*args, **kwargs) 2025-12-04T11:58:24.4830012Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4830052Z with policy(): 2025-12-04T11:58:24.4830242Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4830287Z raise RuntimeError(msg) 2025-12-04T11:58:24.4830674Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4830678Z 2025-12-04T11:58:24.4830751Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4831032Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4831034Z 2025-12-04T11:58:24.4831122Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4831124Z 2025-12-04T11:58:24.4831186Z Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4831232Z Traceback (most recent call last): 2025-12-04T11:58:24.4831397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4831439Z getattr(self, test_name)() 2025-12-04T11:58:24.4831634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4831669Z fn() 2025-12-04T11:58:24.4831820Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4831860Z method(*args, **kwargs) 2025-12-04T11:58:24.4832012Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4832053Z method(*args, **kwargs) 2025-12-04T11:58:24.4832206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4832243Z with policy(): 2025-12-04T11:58:24.4832398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4832442Z raise RuntimeError(msg) 2025-12-04T11:58:24.4832832Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4832834Z 2025-12-04T11:58:24.4832908Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4833184Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4833188Z 2025-12-04T11:58:24.4833275Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4833341Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4833617Z ======================= 1 failed, 3 deselected in 9.13s ======================== 2025-12-04T11:58:24.4833656Z Got exit code 1 2025-12-04T11:58:24.4833700Z Retrying single test... 
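Each attempt also prints the ProcessGroupNCCL warning that destroy_process_group() was not called before the worker exited. For a spawned multi-process run like this one, the usual pattern is for every rank to tear the group down explicitly before returning; a minimal sketch of that pattern (the rendezvous address/port and the toy collective below are placeholders, not taken from the test above):

    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp

    def worker(rank, world_size):
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")   # placeholder rendezvous
        os.environ.setdefault("MASTER_PORT", "29500")
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        t = torch.ones(1, device=f"cuda:{rank}")
        dist.all_reduce(t)                                   # toy workload

        dist.barrier()                                       # ensure all ranks are done
        dist.destroy_process_group()                         # avoids the NCCL shutdown warning

    if __name__ == "__main__":
        mp.spawn(worker, args=(4,), nprocs=4)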
2025-12-04T11:58:24.4833910Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-6e9caf4cb6074b53.xml 2025-12-04T11:58:24.4833969Z ============================= test session starts ============================== 2025-12-04T11:58:24.4834083Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4834125Z cachedir: .pytest_cache 2025-12-04T11:58:24.4834306Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4834356Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4834396Z configfile: pytest.ini 2025-12-04T11:58:24.4834561Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4834640Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4834914Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4834961Z Running 1 items in this shard 2025-12-04T11:58:24.4834964Z 2025-12-04T11:58:24.4835318Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:55:43.715000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 355801 2025-12-04T11:58:24.4835476Z I1204 11:55:43.716000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 355802 2025-12-04T11:58:24.4835628Z I1204 11:55:43.717000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 355803 2025-12-04T11:58:24.4835802Z I1204 11:55:43.717000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 355804 2025-12-04T11:58:24.4836300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4836365Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4836857Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4836919Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4837406Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4837463Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4837950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4838009Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4838189Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4838354Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4838685Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4838845Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4839133Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4839262Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4839540Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4839690Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4839967Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4840146Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4840430Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4840566Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4840847Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4840998Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4841522Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2462056448 and is now 3663724544. 2025-12-04T11:58:24.4841644Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4841840Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4842252Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4842369Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4842584Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4842752Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4842792Z dist init r=0, world=4 2025-12-04T11:58:24.4842950Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4843109Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4843397Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4843553Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4843841Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4843966Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4844245Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4844393Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4844693Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4844841Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4845116Z [rank3]:E1204 11:55:50.881000 355804 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4845253Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4845530Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4845681Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4846203Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4846317Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4846513Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4846923Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4847037Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4847267Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4847430Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4847472Z dist init r=3, world=4 2025-12-04T11:58:24.4847611Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4847771Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4848060Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4848263Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4848550Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4848674Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T11:58:24.4848987Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4849135Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4849412Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4849557Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4849835Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4849971Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4850251Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4850400Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4850918Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4851034Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4851230Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4851672Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4851786Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4852000Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4852168Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4852207Z dist init r=2, world=4 2025-12-04T11:58:24.4852347Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4852505Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4852797Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4852950Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4853257Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4853380Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4853658Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4853808Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4854083Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4854233Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4854506Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4854644Z [rank1]:E1204 11:55:50.927000 355802 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4854923Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4855072Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4855585Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4855700Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4855917Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4856324Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4856439Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4856649Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4856815Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4856856Z dist init r=1, world=4 2025-12-04T11:58:24.4857199Z [rank0]:[W1204 11:55:51.700664178 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4857262Z FAILED [9.0131s] [100%] 2025-12-04T11:58:24.4857264Z 2025-12-04T11:58:24.4857322Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4857459Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.4857505Z Traceback (most recent call last): 2025-12-04T11:58:24.4857671Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4857715Z self._join_processes(fn) 2025-12-04T11:58:24.4857894Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4857949Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4858131Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4858216Z raise RuntimeError(error) 2025-12-04T11:58:24.4858301Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4858348Z Traceback (most recent call last): 2025-12-04T11:58:24.4858511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4858555Z getattr(self, test_name)() 2025-12-04T11:58:24.4858713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4858751Z fn() 2025-12-04T11:58:24.4858904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4858948Z method(*args, **kwargs) 2025-12-04T11:58:24.4859101Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4859146Z method(*args, **kwargs) 2025-12-04T11:58:24.4859295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4859335Z with policy(): 2025-12-04T11:58:24.4859486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4859531Z raise RuntimeError(msg) 2025-12-04T11:58:24.4859953Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2462056448 and is now 3663724544. 
2025-12-04T11:58:24.4859956Z 2025-12-04T11:58:24.4860036Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4860319Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4860323Z 2025-12-04T11:58:24.4860414Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4860416Z 2025-12-04T11:58:24.4860418Z 2025-12-04T11:58:24.4860495Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4860583Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4860838Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-6e9caf4cb6074b53.xml - 2025-12-04T11:58:24.4860899Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4861197Z FAILED [9.0131s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4861282Z Traceback (most recent call last): 2025-12-04T11:58:24.4861452Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4861495Z getattr(self, test_name)() 2025-12-04T11:58:24.4861658Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4861693Z fn() 2025-12-04T11:58:24.4861849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4861891Z method(*args, **kwargs) 2025-12-04T11:58:24.4862043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4862082Z method(*args, **kwargs) 2025-12-04T11:58:24.4862236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4862276Z with policy(): 2025-12-04T11:58:24.4862428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4862469Z raise RuntimeError(msg) 2025-12-04T11:58:24.4862857Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2462056448 and is now 3663724544. 2025-12-04T11:58:24.4862860Z 2025-12-04T11:58:24.4862938Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4863218Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4863222Z 2025-12-04T11:58:24.4863309Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4863371Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4863437Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4863475Z Got exit code 1 2025-12-04T11:58:24.4863518Z Retrying single test... 2025-12-04T11:58:24.4863727Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-74c155eb338d617d.xml 2025-12-04T11:58:24.4863806Z ============================= test session starts ============================== 2025-12-04T11:58:24.4863920Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4863961Z cachedir: .pytest_cache 2025-12-04T11:58:24.4864121Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4864169Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4864211Z configfile: pytest.ini 2025-12-04T11:58:24.4864376Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4864448Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4864722Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4864767Z Running 1 items in this shard 2025-12-04T11:58:24.4864769Z 2025-12-04T11:58:24.4865123Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:55:55.294000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 356203 2025-12-04T11:58:24.4865298Z I1204 11:55:55.295000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 356204 2025-12-04T11:58:24.4865451Z I1204 11:55:55.295000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 356205 2025-12-04T11:58:24.4865601Z I1204 11:55:55.296000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 356206 2025-12-04T11:58:24.4866104Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4866168Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4866656Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4866718Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4867204Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4867266Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4867748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4867806Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4867951Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4868132Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4868459Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4868616Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4868902Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4869027Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4869305Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4869453Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4869764Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4869911Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4870189Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4870326Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4870605Z [rank1]:E1204 11:56:02.484000 356204 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4870755Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4871275Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4871390Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4871586Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4871994Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4872109Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4872352Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4872517Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4872558Z dist init r=1, world=4 2025-12-04T11:58:24.4872696Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4872859Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4873148Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4873303Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4873589Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4873715Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4874011Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4874158Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4874433Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4874581Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4874856Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4874993Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4875273Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4875422Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4875936Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4876052Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4876246Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4876679Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4876794Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4877004Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4877171Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4877211Z dist init r=3, world=4 2025-12-04T11:58:24.4877348Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4877509Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4877797Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4877950Z [rank0]:E1204 11:56:02.577000 356203 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4878343Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4878467Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4878742Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4878892Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4879167Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4879316Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4879590Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4879728Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4880008Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4880156Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4880672Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4880786Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4881016Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4881421Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4881537Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4881748Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4881911Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4881951Z dist init r=0, world=4 2025-12-04T11:58:24.4882091Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4882252Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4882568Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4882724Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4883008Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4883133Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4883408Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4883557Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4883833Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4883978Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4884254Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4884391Z [rank2]:E1204 11:56:02.594000 356205 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4884670Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4884819Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4885348Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4885463Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4885659Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4886065Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4886178Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4886391Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4886555Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4886614Z dist init r=2, world=4 2025-12-04T11:58:24.4886955Z [rank0]:[W1204 11:56:03.658216290 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4886995Z FAILED [9.0134s] [100%] 2025-12-04T11:58:24.4886997Z 2025-12-04T11:58:24.4887053Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4887190Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.4887239Z Traceback (most recent call last): 2025-12-04T11:58:24.4887402Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4887448Z self._join_processes(fn) 2025-12-04T11:58:24.4887621Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4887678Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4887856Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4887900Z raise RuntimeError(error) 2025-12-04T11:58:24.4887982Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4888028Z Traceback (most recent call last): 2025-12-04T11:58:24.4888232Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4888275Z getattr(self, test_name)() 2025-12-04T11:58:24.4888435Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4888470Z fn() 2025-12-04T11:58:24.4888624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4888665Z method(*args, **kwargs) 2025-12-04T11:58:24.4888816Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4888857Z method(*args, **kwargs) 2025-12-04T11:58:24.4889008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4889045Z with policy(): 2025-12-04T11:58:24.4889231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4889272Z raise RuntimeError(msg) 2025-12-04T11:58:24.4889664Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
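[editor's note] The leak report above is raised from a context manager's `__exit__` (common_utils.py line 2705 in the traceback) after comparing per-device memory counters taken before and after the test body. Below is a rough, illustrative sketch of that kind of before/after comparison; the class name `LeakCheckSketch`, the helper `_driver_allocated`, and the exact bookkeeping are assumptions for illustration, not the real check in torch/testing/_internal/common_utils.py.

```python
# Hedged sketch of a CUDA memory-leak check as a context manager:
# snapshot allocator and driver-level memory per device on entry,
# compare on exit, and raise if usage grew. Illustrative only.
import torch


def _driver_allocated(i: int) -> int:
    # Driver-level usage approximated as total minus free (cudaMemGetInfo).
    free, total = torch.cuda.mem_get_info(i)
    return total - free


class LeakCheckSketch:
    def __enter__(self):
        torch.cuda.synchronize()
        self.before = [
            (torch.cuda.memory_allocated(i), _driver_allocated(i))
            for i in range(torch.cuda.device_count())
        ]
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # never mask the test's own exception
        torch.cuda.synchronize()
        for i, (alloc_before, drv_before) in enumerate(self.before):
            alloc_after = torch.cuda.memory_allocated(i)
            if alloc_after > alloc_before:
                raise RuntimeError(
                    f"Possible leak: caching allocator allocated memory was "
                    f"{alloc_before} and is now reported as {alloc_after} on "
                    f"device {i}. Driver allocated memory was {drv_before} "
                    f"and is now {_driver_allocated(i)}."
                )
        return False
```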
2025-12-04T11:58:24.4889668Z 2025-12-04T11:58:24.4889743Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4890023Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4890025Z 2025-12-04T11:58:24.4890114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4890117Z 2025-12-04T11:58:24.4890118Z 2025-12-04T11:58:24.4890195Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4890283Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4890533Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-74c155eb338d617d.xml - 2025-12-04T11:58:24.4890619Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4890911Z FAILED [9.0134s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4890958Z Traceback (most recent call last): 2025-12-04T11:58:24.4891122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4891168Z getattr(self, test_name)() 2025-12-04T11:58:24.4891329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4891365Z fn() 2025-12-04T11:58:24.4891516Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4891559Z method(*args, **kwargs) 2025-12-04T11:58:24.4891710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4891749Z method(*args, **kwargs) 2025-12-04T11:58:24.4891900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4891936Z with policy(): 2025-12-04T11:58:24.4892087Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4892130Z raise RuntimeError(msg) 2025-12-04T11:58:24.4892518Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4892522Z 2025-12-04T11:58:24.4892597Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4892877Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4892879Z 2025-12-04T11:58:24.4892966Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4893030Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4893116Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4893156Z Got exit code 1 2025-12-04T11:58:24.4893382Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4893515Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4893726Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-80c8d1bffe0a078c.xml 2025-12-04T11:58:24.4893783Z ============================= test session starts ============================== 2025-12-04T11:58:24.4893898Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4893942Z cachedir: .pytest_cache 2025-12-04T11:58:24.4894105Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4894152Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4894193Z configfile: pytest.ini 2025-12-04T11:58:24.4894354Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4894447Z collecting ... collected 8 items / 4 deselected / 4 selected 2025-12-04T11:58:24.4894500Z stepcurrent: skipping 4 already run items. 2025-12-04T11:58:24.4894546Z Running 4 items in this shard 2025-12-04T11:58:24.4894548Z 2025-12-04T11:58:24.4894902Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:56:07.019000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 356605 2025-12-04T11:58:24.4895059Z I1204 11:56:07.020000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 356606 2025-12-04T11:58:24.4895211Z I1204 11:56:07.020000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 356607 2025-12-04T11:58:24.4895363Z I1204 11:56:07.021000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 356608 2025-12-04T11:58:24.4895866Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4895928Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4896422Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
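[editor's note] Two warnings recur throughout this run: the FSDP UserWarning that `device_id` "cuda" has no explicit index (which suggests calling `torch.cuda.set_device()` before FSDP initialization or passing an indexed device), and the ProcessGroupNCCL warning that `destroy_process_group()` was not called before exit. A minimal sketch of the pattern both warnings recommend is below; the rendezvous settings (MASTER_ADDR/MASTER_PORT), the toy `nn.Linear` model, and the `run` function are placeholders, not the test's actual setup.

```python
# Hedged sketch of a per-rank worker that avoids both warnings seen above:
# bind the process to its GPU before FSDP init, and tear down the process
# group before exiting.
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def run(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous
    os.environ.setdefault("MASTER_PORT", "29500")
    # Avoid "device_id cuda ... does not have an explicit index":
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        model = FSDP(
            nn.Linear(8, 8).cuda(rank),
            device_id=torch.device("cuda", rank),  # explicit device index
        )
        # ... test / training body for this rank ...
    finally:
        # Avoid "destroy_process_group() was not called before program exit":
        dist.destroy_process_group()
```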
2025-12-04T11:58:24.4896483Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4896969Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4897028Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4897527Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4897589Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4897732Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4897897Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4898222Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4898381Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4898668Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4898831Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4899108Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4899256Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4899534Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4899681Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4899961Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4900098Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4900380Z [rank1]:E1204 11:56:14.165000 356606 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4900532Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4901048Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4901166Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4901361Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4901797Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4901912Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4902125Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4902289Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4902329Z dist init r=1, world=4 2025-12-04T11:58:24.4902467Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4902627Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4902916Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4903090Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4903375Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4903500Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4903776Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4903924Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4904198Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4904347Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4904620Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4904758Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4905039Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4905188Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4905703Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4905817Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4906030Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4906438Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4906553Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4906764Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4906930Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4906970Z dist init r=2, world=4 2025-12-04T11:58:24.4907106Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4907267Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4907581Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4907736Z [rank3]:E1204 11:56:14.180000 356608 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4908023Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4908187Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4908464Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4908612Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4908888Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4909034Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4909310Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4909448Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4909727Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4909876Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4910418Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:58:24.4910535Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4910730Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4911136Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4911252Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4911461Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4911626Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4911696Z dist init r=3, world=4 2025-12-04T11:58:24.4911835Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4911995Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4912285Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4912438Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4912722Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4912848Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4913123Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4913271Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4913547Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4913694Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4913969Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4914106Z [rank0]:E1204 11:56:14.259000 356605 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4914404Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4914552Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4915067Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4915182Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4915380Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4915787Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4915926Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4916137Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4916301Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4916341Z dist init r=0, world=4 2025-12-04T11:58:24.4916681Z [rank0]:[W1204 11:56:14.214690072 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4916722Z FAILED [9.1141s] [ 25%] 2025-12-04T11:58:24.4916724Z 2025-12-04T11:58:24.4916780Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4916919Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4916965Z Traceback (most recent call last): 2025-12-04T11:58:24.4917130Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4917173Z self._join_processes(fn) 2025-12-04T11:58:24.4917348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4917403Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4917582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4917627Z raise RuntimeError(error) 2025-12-04T11:58:24.4917708Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4917755Z Traceback (most recent call last): 2025-12-04T11:58:24.4917916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4917959Z getattr(self, test_name)() 2025-12-04T11:58:24.4918116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4918185Z fn() 2025-12-04T11:58:24.4918336Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4918378Z method(*args, **kwargs) 2025-12-04T11:58:24.4918559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4918600Z method(*args, **kwargs) 2025-12-04T11:58:24.4918751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4918791Z with policy(): 2025-12-04T11:58:24.4918942Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4918984Z raise RuntimeError(msg) 2025-12-04T11:58:24.4919371Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:58:24.4919375Z 2025-12-04T11:58:24.4919452Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4919733Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4919767Z 2025-12-04T11:58:24.4919855Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4919857Z 2025-12-04T11:58:24.4919859Z 2025-12-04T11:58:24.4919934Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4920022Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4920273Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-80c8d1bffe0a078c.xml - 2025-12-04T11:58:24.4920333Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4920626Z FAILED [9.1141s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4920672Z Traceback (most recent call last): 2025-12-04T11:58:24.4920838Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4920881Z getattr(self, test_name)() 2025-12-04T11:58:24.4921040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4921075Z fn() 2025-12-04T11:58:24.4921225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4921266Z method(*args, **kwargs) 2025-12-04T11:58:24.4921418Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4921458Z method(*args, **kwargs) 2025-12-04T11:58:24.4921607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4921646Z with policy(): 2025-12-04T11:58:24.4921797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4921838Z raise RuntimeError(msg) 2025-12-04T11:58:24.4922227Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4922229Z 2025-12-04T11:58:24.4922304Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4922603Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4922606Z 2025-12-04T11:58:24.4922693Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4922759Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4922820Z ======================= 1 failed, 4 deselected in 9.12s ======================== 2025-12-04T11:58:24.4922857Z Got exit code 1 2025-12-04T11:58:24.4922897Z Retrying single test... 2025-12-04T11:58:24.4923104Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-747b209bb437d791.xml 2025-12-04T11:58:24.4923161Z ============================= test session starts ============================== 2025-12-04T11:58:24.4923278Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4923319Z cachedir: .pytest_cache 2025-12-04T11:58:24.4923480Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4923555Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4923597Z configfile: pytest.ini 2025-12-04T11:58:24.4923759Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4923833Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4924104Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4924149Z Running 1 items in this shard 2025-12-04T11:58:24.4924151Z 2025-12-04T11:58:24.4924506Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:56:18.639000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 357007 2025-12-04T11:58:24.4924662Z I1204 11:56:18.640000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 357008 2025-12-04T11:58:24.4924816Z I1204 11:56:18.640000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 357009 2025-12-04T11:58:24.4924966Z I1204 11:56:18.641000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 357010 2025-12-04T11:58:24.4925463Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4925525Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4926019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
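[editor's note] After the consistent failure, the harness retries only the failing node id ("Retrying single test..." with stepcurrent selecting one item). A hedged local equivalent using pytest's Python entry point is sketched below; the flags and the environment handling are illustrative, not the harness's exact invocation (the log's own repro line uses the unittest-style `python test/... Class.test` form instead).

```python
# Hedged sketch: rerun just the failing test node id locally.
import os

import pytest

# Environment variables copied from the repro line printed in the log above.
os.environ["PYTORCH_TEST_WITH_ROCM"] = "1"
os.environ["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"

exit_code = pytest.main([
    "-v", "-x",
    "test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::"
    "test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda",
])
print("Got exit code", int(exit_code))
```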
2025-12-04T11:58:24.4926081Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4926584Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4926643Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4927126Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4927186Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4927329Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4927493Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4927789Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4927965Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4928304Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4928429Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4928708Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4928857Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4929135Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4929284Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4929557Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4929697Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4929974Z [rank0]:E1204 11:56:25.737000 357007 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4930127Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4930646Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4930763Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4930991Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4931400Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4931516Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4931728Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4931895Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4931934Z dist init r=0, world=4 2025-12-04T11:58:24.4932074Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4932233Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4932553Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4932708Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4932993Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4933117Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4933392Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4933542Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4933816Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4933963Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4934241Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4934377Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4934659Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4934808Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4935340Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4935456Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4935653Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4936059Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4936174Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4936385Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4936549Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4936612Z dist init r=2, world=4 2025-12-04T11:58:24.4936750Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4936912Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4937200Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4937355Z [rank3]:E1204 11:56:25.744000 357010 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4937639Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4937764Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4938041Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4938221Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4938499Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4938645Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4938925Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4939063Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4939338Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4939526Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4940038Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:58:24.4940155Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4940351Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4940758Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4940905Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4941116Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4941281Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4941321Z dist init r=3, world=4 2025-12-04T11:58:24.4941459Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4941619Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4941906Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4942061Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4942344Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4942468Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4942743Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4942891Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4943167Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4943314Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4943592Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4943747Z [rank1]:E1204 11:56:25.785000 357008 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4944026Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4944175Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4944689Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4944804Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4945000Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4945425Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4945538Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4945750Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4945916Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4945956Z dist init r=1, world=4 2025-12-04T11:58:24.4946292Z [rank0]:[W1204 11:56:25.574120505 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4946335Z FAILED [9.0137s] [100%] 2025-12-04T11:58:24.4946338Z 2025-12-04T11:58:24.4946394Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4946532Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4946579Z Traceback (most recent call last): 2025-12-04T11:58:24.4946744Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4946789Z self._join_processes(fn) 2025-12-04T11:58:24.4946963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4947018Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4947198Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4947243Z raise RuntimeError(error) 2025-12-04T11:58:24.4947323Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4947369Z Traceback (most recent call last): 2025-12-04T11:58:24.4947530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4947573Z getattr(self, test_name)() 2025-12-04T11:58:24.4947749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4947785Z fn() 2025-12-04T11:58:24.4947937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4947980Z method(*args, **kwargs) 2025-12-04T11:58:24.4948131Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4948212Z method(*args, **kwargs) 2025-12-04T11:58:24.4948362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4948399Z with policy(): 2025-12-04T11:58:24.4948551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4948594Z raise RuntimeError(msg) 2025-12-04T11:58:24.4948982Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4949466Z 2025-12-04T11:58:24.4949542Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4949824Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4949827Z 2025-12-04T11:58:24.4949913Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4949916Z 2025-12-04T11:58:24.4949918Z 2025-12-04T11:58:24.4949993Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4950082Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4950333Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-747b209bb437d791.xml - 2025-12-04T11:58:24.4950393Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4950693Z FAILED [9.0137s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4950739Z Traceback (most recent call last): 2025-12-04T11:58:24.4950904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4950947Z getattr(self, test_name)() 2025-12-04T11:58:24.4951107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4951143Z fn() 2025-12-04T11:58:24.4951294Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4951335Z method(*args, **kwargs) 2025-12-04T11:58:24.4951485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4951526Z method(*args, **kwargs) 2025-12-04T11:58:24.4951676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4951715Z with policy(): 2025-12-04T11:58:24.4951866Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4951908Z raise RuntimeError(msg) 2025-12-04T11:58:24.4952328Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4952330Z 2025-12-04T11:58:24.4952406Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4952688Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4952690Z 2025-12-04T11:58:24.4952776Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4952840Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4952902Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4952940Z Got exit code 1 2025-12-04T11:58:24.4952981Z Retrying single test... 2025-12-04T11:58:24.4953191Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-c8863613f03f4d7f.xml 2025-12-04T11:58:24.4953248Z ============================= test session starts ============================== 2025-12-04T11:58:24.4953382Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4953423Z cachedir: .pytest_cache 2025-12-04T11:58:24.4953582Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4953629Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4953670Z configfile: pytest.ini 2025-12-04T11:58:24.4953834Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4953908Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4954181Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4954227Z Running 1 items in this shard 2025-12-04T11:58:24.4954230Z 2025-12-04T11:58:24.4954583Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:56:30.136000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 357409 2025-12-04T11:58:24.4954738Z I1204 11:56:30.136000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 357410 2025-12-04T11:58:24.4954890Z I1204 11:56:30.137000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 357411 2025-12-04T11:58:24.4955043Z I1204 11:56:30.137000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 357412 2025-12-04T11:58:24.4955542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4955605Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4956093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4956180Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4956663Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4956724Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4957205Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4957264Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4957408Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4957573Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4957884Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4958038Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4958360Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4958485Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4958762Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4958910Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4959187Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4959334Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4959611Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4959748Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4960029Z [rank0]:E1204 11:56:37.169000 357409 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4960178Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4960727Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4960844Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4961041Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4961449Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4961563Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4961776Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4961940Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4962002Z dist init r=0, world=4 2025-12-04T11:58:24.4962142Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4962302Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4962588Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4962743Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4963026Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4963153Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4963428Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4963576Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4963853Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4963999Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4964275Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4964412Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4964690Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4964856Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4965371Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4965488Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4965684Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4966092Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4966224Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4966436Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4966599Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4966639Z dist init r=1, world=4 2025-12-04T11:58:24.4966780Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4966943Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4967230Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4967385Z [rank2]:E1204 11:56:37.192000 357411 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4967670Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4967793Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4968071Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4968261Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4968539Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4968686Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4968963Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4969125Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4969402Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4969552Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4970063Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4970180Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4970376Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4971040Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4971207Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4971532Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4971817Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4971874Z dist init r=2, world=4 2025-12-04T11:58:24.4972090Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4972336Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4972802Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4973036Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4973462Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4973661Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4974092Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4974317Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4974742Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4975042Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4975497Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4975711Z [rank3]:E1204 11:56:37.203000 357412 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4976076Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4976225Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4977005Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4977150Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4977385Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4977812Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4977928Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4978237Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4978461Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4978503Z dist init r=3, world=4 2025-12-04T11:58:24.4978843Z [rank0]:[W1204 11:56:37.004034589 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4978895Z FAILED [8.8128s] [100%] 2025-12-04T11:58:24.4978899Z 2025-12-04T11:58:24.4978975Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4979113Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4979162Z Traceback (most recent call last): 2025-12-04T11:58:24.4979326Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4979373Z self._join_processes(fn) 2025-12-04T11:58:24.4979595Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4979678Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4979859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4979904Z raise RuntimeError(error) 2025-12-04T11:58:24.4979985Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4980072Z Traceback (most recent call last): 2025-12-04T11:58:24.4980234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4980278Z getattr(self, test_name)() 2025-12-04T11:58:24.4980440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4980477Z fn() 2025-12-04T11:58:24.4980629Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4980671Z method(*args, **kwargs) 2025-12-04T11:58:24.4980822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4980864Z method(*args, **kwargs) 2025-12-04T11:58:24.4981015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4981055Z with policy(): 2025-12-04T11:58:24.4981206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4981249Z raise RuntimeError(msg) 2025-12-04T11:58:24.4981673Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4981675Z 2025-12-04T11:58:24.4981751Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4982031Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4982034Z 2025-12-04T11:58:24.4982125Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4982127Z 2025-12-04T11:58:24.4982129Z 2025-12-04T11:58:24.4982206Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4982297Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4982550Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-c8863613f03f4d7f.xml - 2025-12-04T11:58:24.4982612Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4982905Z FAILED [8.8128s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4982953Z Traceback (most recent call last): 2025-12-04T11:58:24.4983120Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4983164Z getattr(self, test_name)() 2025-12-04T11:58:24.4983322Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4983359Z fn() 2025-12-04T11:58:24.4983510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4983551Z method(*args, **kwargs) 2025-12-04T11:58:24.4983700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4983741Z method(*args, **kwargs) 2025-12-04T11:58:24.4983890Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4983948Z with policy(): 2025-12-04T11:58:24.4984099Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4984141Z raise RuntimeError(msg) 2025-12-04T11:58:24.4984528Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4984533Z 2025-12-04T11:58:24.4984608Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4984888Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4984891Z 2025-12-04T11:58:24.4984979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4985043Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4985105Z ======================= 1 failed, 7 deselected in 8.82s ======================== 2025-12-04T11:58:24.4985170Z Got exit code 1 2025-12-04T11:58:24.4985399Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4985531Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4985741Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-fcbfdc75c93b62ce.xml 2025-12-04T11:58:24.4985799Z ============================= test session starts ============================== 2025-12-04T11:58:24.4985915Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4985958Z cachedir: .pytest_cache 2025-12-04T11:58:24.4986117Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4986166Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4986206Z configfile: pytest.ini 2025-12-04T11:58:24.4986371Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4986445Z collecting ... collected 8 items / 5 deselected / 3 selected 2025-12-04T11:58:24.4986497Z stepcurrent: skipping 5 already run items. 2025-12-04T11:58:24.4986542Z Running 3 items in this shard 2025-12-04T11:58:24.4986544Z 2025-12-04T11:58:24.4986902Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:56:41.476000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 357811 2025-12-04T11:58:24.4987059Z I1204 11:56:41.477000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 357812 2025-12-04T11:58:24.4987210Z I1204 11:56:41.477000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 357813 2025-12-04T11:58:24.4987362Z I1204 11:56:41.478000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 357814 2025-12-04T11:58:24.4987869Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4987955Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4988522Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4988585Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4989072Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4989132Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4989614Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4989722Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4989866Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4990031Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4990326Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4990482Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4990769Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4990894Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4991173Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4991321Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4991597Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4991745Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4992020Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4992156Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4992472Z [rank0]:E1204 11:56:48.492000 357811 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4992620Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4993137Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4993253Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4993448Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4993857Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4993991Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4994203Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4994368Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4994407Z dist init r=0, world=4 2025-12-04T11:58:24.4994548Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4994708Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4994996Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4995151Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4995434Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4995560Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4995836Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4995985Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4996259Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4996405Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4996698Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4996836Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4997115Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4997265Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4997780Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4997895Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4998091Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4998566Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4998681Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4998893Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4999058Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4999098Z dist init r=1, world=4 2025-12-04T11:58:24.4999237Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4999398Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4999684Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4999838Z [rank2]:E1204 11:56:48.503000 357813 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5000122Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5000248Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5000524Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5000671Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5000986Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5001132Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5001407Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5001546Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5001824Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5001971Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5002485Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.5002652Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5002879Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5003288Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5003401Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5003612Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5003780Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5003818Z dist init r=2, world=4 2025-12-04T11:58:24.5003958Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5004116Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5004404Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5004557Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5004842Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5004964Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5005240Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5005413Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5005687Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5005835Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5006113Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5006251Z [rank3]:E1204 11:56:48.514000 357814 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5006528Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5006676Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5007207Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.5007321Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5007517Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5007924Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5008040Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5008301Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5008467Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5008509Z dist init r=3, world=4 2025-12-04T11:58:24.5008847Z [rank0]:[W1204 11:56:48.316232342 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.5008889Z FAILED [8.8143s] [ 33%] 2025-12-04T11:58:24.5008891Z 2025-12-04T11:58:24.5008948Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5009084Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.5009131Z Traceback (most recent call last): 2025-12-04T11:58:24.5009295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5009339Z self._join_processes(fn) 2025-12-04T11:58:24.5009554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5009610Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5009790Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5009835Z raise RuntimeError(error) 2025-12-04T11:58:24.5009917Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5009963Z Traceback (most recent call last): 2025-12-04T11:58:24.5010125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5010168Z getattr(self, test_name)() 2025-12-04T11:58:24.5010326Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5010361Z fn() 2025-12-04T11:58:24.5010514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5010556Z method(*args, **kwargs) 2025-12-04T11:58:24.5010706Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5010780Z method(*args, **kwargs) 2025-12-04T11:58:24.5010930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5010968Z with policy(): 2025-12-04T11:58:24.5011119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5011162Z raise RuntimeError(msg) 2025-12-04T11:58:24.5011550Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5011553Z 2025-12-04T11:58:24.5011635Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5011915Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5011919Z 2025-12-04T11:58:24.5012008Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5012010Z 2025-12-04T11:58:24.5012012Z 2025-12-04T11:58:24.5012087Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5012176Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5012431Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-fcbfdc75c93b62ce.xml - 2025-12-04T11:58:24.5012492Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5012786Z FAILED [8.8143s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5012834Z Traceback (most recent call last): 2025-12-04T11:58:24.5013000Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5013043Z getattr(self, test_name)() 2025-12-04T11:58:24.5013205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5013260Z fn() 2025-12-04T11:58:24.5013454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5013495Z method(*args, **kwargs) 2025-12-04T11:58:24.5013646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5013686Z method(*args, **kwargs) 2025-12-04T11:58:24.5013839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5013876Z with policy(): 2025-12-04T11:58:24.5014029Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5014069Z raise RuntimeError(msg) 2025-12-04T11:58:24.5014460Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.5014462Z 2025-12-04T11:58:24.5014537Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5014817Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5014840Z 2025-12-04T11:58:24.5014929Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5014991Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5015055Z ======================= 1 failed, 5 deselected in 8.83s ======================== 2025-12-04T11:58:24.5015092Z Got exit code 1 2025-12-04T11:58:24.5015133Z Retrying single test... 2025-12-04T11:58:24.5015343Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-681abe36e7aff16a.xml 2025-12-04T11:58:24.5015401Z ============================= test session starts ============================== 2025-12-04T11:58:24.5015514Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5015558Z cachedir: .pytest_cache 2025-12-04T11:58:24.5015717Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5015766Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5015806Z configfile: pytest.ini 2025-12-04T11:58:24.5015970Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5016043Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5016317Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5016362Z Running 1 items in this shard 2025-12-04T11:58:24.5016364Z 2025-12-04T11:58:24.5016715Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:56:52.889000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 358213 2025-12-04T11:58:24.5016870Z I1204 11:56:52.890000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 358214 2025-12-04T11:58:24.5017022Z I1204 11:56:52.890000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 358215 2025-12-04T11:58:24.5017174Z I1204 11:56:52.891000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 358216 2025-12-04T11:58:24.5017690Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5017754Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5018278Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5018338Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5018827Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5018929Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5019415Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5019473Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5019619Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5019783Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5020076Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5020235Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5020522Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5020648Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5020928Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5021077Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5021353Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5021499Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5021810Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5021947Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5022226Z [rank1]:E1204 11:57:00.029000 358214 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5022378Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5022895Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.5023011Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5023206Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5023633Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5023749Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5023963Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5024128Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5024167Z dist init r=1, world=4 2025-12-04T11:58:24.5024307Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5024466Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5024752Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5024905Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5025192Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5025316Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5025595Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5025741Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.5026037Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5026186Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5026462Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5026601Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5026877Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5027026Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5027541Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5027675Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5027870Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5028331Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5028447Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5028660Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5028826Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5028867Z dist init r=2, world=4 2025-12-04T11:58:24.5029004Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5029163Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5029451Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5029605Z [rank0]:E1204 11:57:00.069000 358213 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5029890Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5030015Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5030291Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5030469Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5030746Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5030894Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5031168Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5031303Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5031582Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5031730Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5032282Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5032395Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5032591Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5033001Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5033117Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5033328Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5033492Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5033531Z dist init r=0, world=4 2025-12-04T11:58:24.5033670Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5033830Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5034118Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5034271Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5034555Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5034698Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5034974Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5035123Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5035400Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5035548Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5035825Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5035961Z [rank3]:E1204 11:57:00.085000 358216 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5036264Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5036413Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5036926Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.5037039Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5037236Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5037644Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5037758Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5037971Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5038135Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5038209Z dist init r=3, world=4 2025-12-04T11:58:24.5038546Z [rank0]:[W1204 11:57:00.977697371 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.5038587Z FAILED [9.2133s] [100%] 2025-12-04T11:58:24.5038589Z 2025-12-04T11:58:24.5038646Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5038782Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.5038859Z Traceback (most recent call last): 2025-12-04T11:58:24.5039023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5039068Z self._join_processes(fn) 2025-12-04T11:58:24.5039245Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5039299Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5039478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5039521Z raise RuntimeError(error) 2025-12-04T11:58:24.5039603Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5039649Z Traceback (most recent call last): 2025-12-04T11:58:24.5039814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5039858Z getattr(self, test_name)() 2025-12-04T11:58:24.5040016Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5040084Z fn() 2025-12-04T11:58:24.5040235Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5040277Z method(*args, **kwargs) 2025-12-04T11:58:24.5040428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5040469Z method(*args, **kwargs) 2025-12-04T11:58:24.5040618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5040656Z with policy(): 2025-12-04T11:58:24.5040810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5040852Z raise RuntimeError(msg) 2025-12-04T11:58:24.5041239Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5041243Z 2025-12-04T11:58:24.5041319Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5041599Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5041602Z 2025-12-04T11:58:24.5041690Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5041692Z 2025-12-04T11:58:24.5041754Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5041800Z Traceback (most recent call last): 2025-12-04T11:58:24.5041963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5042006Z getattr(self, test_name)() 2025-12-04T11:58:24.5042166Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5042200Z fn() 2025-12-04T11:58:24.5042352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5042392Z method(*args, **kwargs) 2025-12-04T11:58:24.5042542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5042582Z method(*args, **kwargs) 2025-12-04T11:58:24.5042749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5042786Z with policy(): 2025-12-04T11:58:24.5042937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5042980Z raise RuntimeError(msg) 2025-12-04T11:58:24.5043365Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5043368Z 2025-12-04T11:58:24.5043442Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5043721Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5043724Z 2025-12-04T11:58:24.5043811Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5043813Z 2025-12-04T11:58:24.5043815Z 2025-12-04T11:58:24.5043890Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5044003Z Process 0 terminated with exit code 10, terminating remaining processes. 
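Note on the ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit"): this is separate from the leak assertion itself, but the shutdown pattern it asks for can be sketched as below. This is a minimal illustration assuming a plain torch.distributed setup, not the MultiProcessTestCase harness used by this test.

    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        # Assumed illustrative setup; the real test initializes its own store/backend.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # collective work goes here
        finally:
            # Tear the default process group down explicitly so NCCL resources are
            # released before the process exits, avoiding the warning seen in the log.
            dist.destroy_process_group()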
2025-12-04T11:58:24.5044253Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-681abe36e7aff16a.xml - 2025-12-04T11:58:24.5044316Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5044610Z FAILED [9.2133s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5044661Z Traceback (most recent call last): 2025-12-04T11:58:24.5044824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5044869Z getattr(self, test_name)() 2025-12-04T11:58:24.5045031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5045066Z fn() 2025-12-04T11:58:24.5045218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5045258Z method(*args, **kwargs) 2025-12-04T11:58:24.5045410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5045450Z method(*args, **kwargs) 2025-12-04T11:58:24.5045604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5045641Z with policy(): 2025-12-04T11:58:24.5045794Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5045835Z raise RuntimeError(msg) 2025-12-04T11:58:24.5046224Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5046226Z 2025-12-04T11:58:24.5046300Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5046579Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5046599Z 2025-12-04T11:58:24.5046690Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5046692Z 2025-12-04T11:58:24.5046752Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5046800Z Traceback (most recent call last): 2025-12-04T11:58:24.5046965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5047009Z getattr(self, test_name)() 2025-12-04T11:58:24.5047169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5047207Z fn() 2025-12-04T11:58:24.5047358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5047400Z method(*args, **kwargs) 2025-12-04T11:58:24.5047551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5047594Z method(*args, **kwargs) 2025-12-04T11:58:24.5047746Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5050838Z with policy(): 2025-12-04T11:58:24.5051003Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5051045Z raise RuntimeError(msg) 2025-12-04T11:58:24.5051438Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5051440Z 2025-12-04T11:58:24.5051515Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5051800Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5051803Z 2025-12-04T11:58:24.5051892Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5051962Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5052026Z ======================= 1 failed, 7 deselected in 9.22s ======================== 2025-12-04T11:58:24.5052064Z Got exit code 1 2025-12-04T11:58:24.5052106Z Retrying single test... 
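Note on the leak assertion: the RuntimeError compares caching-allocator and driver allocation counters sampled before and after the test body (512 vs 3072/3584 bytes allocator-side, and the ~1.2 GB driver-side growth). A rough approximation of that kind of check is sketched below; it is not the actual mem-leak-check context manager in common_utils.py, and the function name and threshold logic are illustrative only.

    import torch

    def assert_no_cuda_leak(fn, device: int = 0) -> None:
        # Snapshot allocator and driver-level memory before running the test body.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)        # caching allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)      # driver-level free/total bytes
        fn()
        # Snapshot again after the test body and compare.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible CUDA leak: allocator {alloc_before} -> {alloc_after}, "
                f"driver allocated {total - free_before} -> {total - free_after}"
            )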
2025-12-04T11:58:24.5052316Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-993e4ba5ce1d3537.xml 2025-12-04T11:58:24.5052375Z ============================= test session starts ============================== 2025-12-04T11:58:24.5052491Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5052534Z cachedir: .pytest_cache 2025-12-04T11:58:24.5052693Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5052743Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5052783Z configfile: pytest.ini 2025-12-04T11:58:24.5052949Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5053022Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5053299Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5053343Z Running 1 items in this shard 2025-12-04T11:58:24.5053346Z 2025-12-04T11:58:24.5053759Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:57:04.434000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 358615 2025-12-04T11:58:24.5053919Z I1204 11:57:04.435000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 358616 2025-12-04T11:58:24.5054072Z I1204 11:57:04.435000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 358617 2025-12-04T11:58:24.5054223Z I1204 11:57:04.436000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 358618 2025-12-04T11:58:24.5054725Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5054789Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5055276Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5055384Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5055871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5055930Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5056412Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5056469Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5056617Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5056784Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5057081Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5057239Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5057526Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5057655Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5057953Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5058104Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5058416Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5058566Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5058841Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5058982Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5059263Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5059447Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5059969Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.5060086Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5060283Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5060695Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5060812Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5061027Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5061193Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5061233Z dist init r=3, world=4 2025-12-04T11:58:24.5061376Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5061536Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5061825Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5061978Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5062293Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5062418Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5062693Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5062841Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5063118Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5063265Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5063542Z [rank2]:E1204 11:57:11.715000 358617 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5063698Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5063975Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5064122Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5064636Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5064752Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5064949Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5065355Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5065472Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5065684Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5065851Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5065892Z dist init r=2, world=4 2025-12-04T11:58:24.5066031Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5066193Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5066479Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5066652Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5066936Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5067064Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T11:58:24.5067339Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5067487Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5067764Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5067910Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5068241Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5068377Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5068657Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5068804Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5069321Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:58:24.5069439Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5069635Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5070042Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5070157Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5070370Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5070536Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5070575Z dist init r=1, world=4 2025-12-04T11:58:24.5070712Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5070912Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5071201Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5071357Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5071642Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5071766Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5072045Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5072191Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5072500Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5072646Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5072923Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5073060Z [rank0]:E1204 11:57:11.758000 358615 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5073337Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5073487Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5073998Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.5074113Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5074308Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5074714Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5074829Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5075042Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5075227Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5075266Z dist init r=0, world=4 2025-12-04T11:58:24.5075606Z [rank0]:[W1204 11:57:12.681914822 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.5075648Z FAILED [9.2146s] [100%] 2025-12-04T11:58:24.5075651Z 2025-12-04T11:58:24.5075708Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5075845Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.5075892Z Traceback (most recent call last): 2025-12-04T11:58:24.5076058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5076102Z self._join_processes(fn) 2025-12-04T11:58:24.5076278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5076352Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5076532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5076576Z raise RuntimeError(error) 2025-12-04T11:58:24.5076659Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.5076704Z Traceback (most recent call last): 2025-12-04T11:58:24.5076866Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5076909Z getattr(self, test_name)() 2025-12-04T11:58:24.5077068Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5077102Z fn() 2025-12-04T11:58:24.5077256Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5077298Z method(*args, **kwargs) 2025-12-04T11:58:24.5077450Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5077490Z method(*args, **kwargs) 2025-12-04T11:58:24.5077641Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5077678Z with policy(): 2025-12-04T11:58:24.5077831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5077875Z raise RuntimeError(msg) 2025-12-04T11:58:24.5078308Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 
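The ProcessGroupNCCL warning above recommends calling destroy_process_group() before the program exits. A minimal teardown sketch of that pattern, assuming a launcher such as torchrun has already set RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT (this is an illustration, not the test harness code itself):

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        rank = int(os.environ.get("RANK", "0"))
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
        # "nccl" maps onto RCCL on ROCm builds, so the same backend name applies here.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)
        try:
            pass  # test or training body would run here
        finally:
            # Explicit shutdown; avoids the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()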
2025-12-04T11:58:24.5078312Z 2025-12-04T11:58:24.5078389Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5078670Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5078673Z 2025-12-04T11:58:24.5078761Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5078763Z 2025-12-04T11:58:24.5078765Z 2025-12-04T11:58:24.5078842Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5078963Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5079217Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-993e4ba5ce1d3537.xml - 2025-12-04T11:58:24.5079277Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5079571Z FAILED [9.2146s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.5079617Z Traceback (most recent call last): 2025-12-04T11:58:24.5079782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5079825Z getattr(self, test_name)() 2025-12-04T11:58:24.5079988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5080022Z fn() 2025-12-04T11:58:24.5080174Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5080214Z method(*args, **kwargs) 2025-12-04T11:58:24.5080396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5080435Z method(*args, **kwargs) 2025-12-04T11:58:24.5080584Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5080621Z with policy(): 2025-12-04T11:58:24.5080773Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5080813Z raise RuntimeError(msg) 2025-12-04T11:58:24.5081207Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.5081209Z 2025-12-04T11:58:24.5081284Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5081565Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5081567Z 2025-12-04T11:58:24.5081654Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5081717Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5081780Z ======================= 1 failed, 7 deselected in 9.22s ======================== 2025-12-04T11:58:24.5081817Z Got exit code 1 2025-12-04T11:58:24.5082050Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5082179Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.5082389Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-21ec418542ac0484.xml 2025-12-04T11:58:24.5082446Z ============================= test session starts ============================== 2025-12-04T11:58:24.5082560Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5082601Z cachedir: .pytest_cache 2025-12-04T11:58:24.5082759Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5082806Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5082864Z configfile: pytest.ini 2025-12-04T11:58:24.5083029Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5083101Z collecting ... collected 8 items / 6 deselected / 2 selected 2025-12-04T11:58:24.5083157Z stepcurrent: skipping 6 already run items. 2025-12-04T11:58:24.5083202Z Running 2 items in this shard 2025-12-04T11:58:24.5083204Z 2025-12-04T11:58:24.5083507Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:57:16.270000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359017 2025-12-04T11:58:24.5083661Z I1204 11:57:16.271000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359018 2025-12-04T11:58:24.5083817Z I1204 11:57:16.271000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359019 2025-12-04T11:58:24.5083970Z I1204 11:57:16.272000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359020 2025-12-04T11:58:24.5084473Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5084557Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5085047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5085107Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5085591Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5085652Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5086137Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5086196Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5086341Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5086506Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5086800Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5086954Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5087267Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5087392Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5087671Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5087821Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5088096Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5088285Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5088560Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5088733Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5089012Z [rank1]:E1204 11:57:23.719000 359018 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5089162Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5089630Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5089747Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5089943Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5090296Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5090411Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5090623Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5090787Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5090829Z dist init r=1, world=4 2025-12-04T11:58:24.5090967Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5091127Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5091417Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5091600Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5091883Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5092010Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5092283Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5092431Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5092710Z [rank0]:E1204 11:57:23.725000 359017 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5092855Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5093152Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5093288Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5093568Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5093718Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5094180Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2462056448 and is now 3665821696. 2025-12-04T11:58:24.5094297Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5094492Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5094846Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5094958Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5095171Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5095336Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5095376Z dist init r=0, world=4 2025-12-04T11:58:24.5095513Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5095694Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5095983Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5096137Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5096423Z [rank2]:E1204 11:57:23.757000 359019 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5096546Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5096823Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5096970Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5097245Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5097412Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5097686Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5097824Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5098100Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5098289Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5098750Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
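The RuntimeError above is raised by the test suite's CUDA memory leak check, which compares caching-allocator and driver-level allocations before and after each test. A standalone sketch of the same before/after comparison using only public torch.cuda APIs (an illustration of the idea, not the actual check in common_utils.py; the helper name run_with_leak_check is made up for this example):

    import torch

    def run_with_leak_check(fn, device: int = 0) -> None:
        # Hypothetical helper, named for this example only.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)    # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)  # driver-level free/total bytes
        fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before and free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator went from "
                f"{alloc_before} to {alloc_after} bytes, driver-allocated memory from "
                f"{total - free_before} to {total - free_after} bytes"
            )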
2025-12-04T11:58:24.5098866Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5099066Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5099417Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5099532Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5099743Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5099907Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5099975Z dist init r=2, world=4 2025-12-04T11:58:24.5100115Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5100275Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5100564Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5100719Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5101003Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5101129Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5101407Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5101595Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5101869Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5102015Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5102291Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5102425Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5102703Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5102849Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5103315Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:58:24.5103430Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5103628Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5103982Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5104094Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5104323Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5104487Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5104528Z dist init r=3, world=4 2025-12-04T11:58:24.5104566Z FAILED [8.8123s] [ 50%] 2025-12-04T11:58:24.5104569Z 2025-12-04T11:58:24.5104625Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5104722Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:58:24.5104769Z Traceback (most recent call last): 2025-12-04T11:58:24.5104933Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5104976Z self._join_processes(fn) 2025-12-04T11:58:24.5105152Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5105207Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5105385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5105451Z raise RuntimeError(error) 2025-12-04T11:58:24.5105532Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5105579Z Traceback (most recent call last): 2025-12-04T11:58:24.5105739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5105782Z getattr(self, test_name)() 2025-12-04T11:58:24.5105942Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5105976Z fn() 2025-12-04T11:58:24.5106129Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5106171Z method(*args, **kwargs) 2025-12-04T11:58:24.5106320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5106362Z method(*args, **kwargs) 2025-12-04T11:58:24.5106511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5106548Z with policy(): 2025-12-04T11:58:24.5106699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5106741Z raise RuntimeError(msg) 2025-12-04T11:58:24.5107078Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5107081Z 2025-12-04T11:58:24.5107157Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5107383Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5107387Z 2025-12-04T11:58:24.5107475Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5107477Z 2025-12-04T11:58:24.5107478Z 2025-12-04T11:58:24.5107555Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5107642Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5107891Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-21ec418542ac0484.xml - 2025-12-04T11:58:24.5107969Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5108248Z FAILED [8.8123s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5108296Z Traceback (most recent call last): 2025-12-04T11:58:24.5108462Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5108504Z getattr(self, test_name)() 2025-12-04T11:58:24.5108663Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5108698Z fn() 2025-12-04T11:58:24.5108848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5108889Z method(*args, **kwargs) 2025-12-04T11:58:24.5109039Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5109079Z method(*args, **kwargs) 2025-12-04T11:58:24.5109227Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5109299Z with policy(): 2025-12-04T11:58:24.5109449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5109490Z raise RuntimeError(msg) 2025-12-04T11:58:24.5109827Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5109829Z 2025-12-04T11:58:24.5109903Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5110127Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5110130Z 2025-12-04T11:58:24.5110217Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5110282Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5110345Z ======================= 1 failed, 6 deselected in 8.82s ======================== 2025-12-04T11:58:24.5110381Z Got exit code 1 2025-12-04T11:58:24.5110422Z Retrying single test... 
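The repeated UserWarning from torch/distributed/fsdp/_init_utils.py above suggests two fixes: call torch.cuda.set_device() before constructing FSDP, or pass a device_id with an explicit index. A minimal single-process sketch of both, assuming one visible GPU and a free local TCP port (model and local_rank are placeholder names, not taken from the failing test):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Single-rank group just so FSDP can be constructed in this sketch.
    dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                            rank=0, world_size=1)

    local_rank = 0  # under torchrun this would be int(os.environ["LOCAL_RANK"])

    # Option 1: make the current device explicit before building FSDP.
    torch.cuda.set_device(local_rank)

    # Option 2: pass an indexed device instead of the bare "cuda" string.
    model = torch.nn.Linear(8, 8)
    fsdp_model = FSDP(model, device_id=torch.device("cuda", local_rank))

    dist.destroy_process_group()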
2025-12-04T11:58:24.5110629Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-a2b976bde10bee43.xml 2025-12-04T11:58:24.5110686Z ============================= test session starts ============================== 2025-12-04T11:58:24.5110801Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5110842Z cachedir: .pytest_cache 2025-12-04T11:58:24.5111005Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5111054Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5111098Z configfile: pytest.ini 2025-12-04T11:58:24.5111262Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5111337Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5111553Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5111599Z Running 1 items in this shard 2025-12-04T11:58:24.5111601Z 2025-12-04T11:58:24.5111936Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:57:27.659000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359411 2025-12-04T11:58:24.5112094Z I1204 11:57:27.659000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359412 2025-12-04T11:58:24.5112247Z I1204 11:57:27.660000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359413 2025-12-04T11:58:24.5112398Z I1204 11:57:27.660000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359414 2025-12-04T11:58:24.5112899Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5112963Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5113454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5113533Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5114019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5114079Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5114564Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5114623Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5114767Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5114931Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5115221Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5115377Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5115668Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5115794Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5116072Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5116240Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5116518Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5116667Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5116942Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5117078Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5117357Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5117506Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5118001Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5118119Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5118350Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5118704Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5118821Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5119032Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5119197Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5119236Z dist init r=2, world=4 2025-12-04T11:58:24.5119376Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5119534Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5119820Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5119975Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5120263Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5120388Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5120697Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5120845Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5121123Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5121270Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5121547Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5121684Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5121961Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5122142Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5122609Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5122725Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5122921Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5123274Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5123389Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5123602Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5123768Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5123808Z dist init r=1, world=4 2025-12-04T11:58:24.5123945Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5124106Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5124391Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5124546Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5124852Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5124978Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5125255Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5125402Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5125677Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5125824Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5126099Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5126255Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5126533Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5126681Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5127148Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:58:24.5127264Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5127459Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5127810Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5127924Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5128136Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5128353Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5128393Z dist init r=0, world=4 2025-12-04T11:58:24.5128531Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5128690Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5129011Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5129165Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5129452Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5129578Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5129854Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5130002Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5130278Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5130426Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5130741Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5130876Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5131154Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5131302Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5131768Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:58:24.5131884Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5132080Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5132431Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5132545Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5132757Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5132921Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5132959Z dist init r=3, world=4 2025-12-04T11:58:24.5132998Z FAILED [8.9117s] [100%] 2025-12-04T11:58:24.5133000Z 2025-12-04T11:58:24.5133059Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5133180Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:58:24.5133229Z Traceback (most recent call last): 2025-12-04T11:58:24.5133393Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5133438Z self._join_processes(fn) 2025-12-04T11:58:24.5133610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5133666Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5133845Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5133889Z raise RuntimeError(error) 2025-12-04T11:58:24.5133970Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5134017Z Traceback (most recent call last): 2025-12-04T11:58:24.5134179Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5134222Z getattr(self, test_name)() 2025-12-04T11:58:24.5134379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5134437Z fn() 2025-12-04T11:58:24.5134587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5134630Z method(*args, **kwargs) 2025-12-04T11:58:24.5134779Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5134820Z method(*args, **kwargs) 2025-12-04T11:58:24.5134970Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5135009Z with policy(): 2025-12-04T11:58:24.5135163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5135205Z raise RuntimeError(msg) 2025-12-04T11:58:24.5135541Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5135545Z 2025-12-04T11:58:24.5135621Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5135846Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5135848Z 2025-12-04T11:58:24.5135937Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5135939Z 2025-12-04T11:58:24.5136001Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5136047Z Traceback (most recent call last): 2025-12-04T11:58:24.5136211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5136253Z getattr(self, test_name)() 2025-12-04T11:58:24.5136414Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5136448Z fn() 2025-12-04T11:58:24.5136599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5136639Z method(*args, **kwargs) 2025-12-04T11:58:24.5136789Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5136829Z method(*args, **kwargs) 2025-12-04T11:58:24.5136997Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5137034Z with policy(): 2025-12-04T11:58:24.5137186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5137229Z raise RuntimeError(msg) 2025-12-04T11:58:24.5137564Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
2025-12-04T11:58:24.5137566Z 2025-12-04T11:58:24.5137641Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5137864Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5137866Z 2025-12-04T11:58:24.5137956Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5137958Z 2025-12-04T11:58:24.5137960Z 2025-12-04T11:58:24.5138036Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5138125Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5138441Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-a2b976bde10bee43.xml - 2025-12-04T11:58:24.5138503Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5138746Z FAILED [8.9117s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5138793Z Traceback (most recent call last): 2025-12-04T11:58:24.5138958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5139000Z getattr(self, test_name)() 2025-12-04T11:58:24.5139159Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5139196Z fn() 2025-12-04T11:58:24.5139347Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5139386Z method(*args, **kwargs) 2025-12-04T11:58:24.5139536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5139576Z method(*args, **kwargs) 2025-12-04T11:58:24.5139725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5139762Z with policy(): 2025-12-04T11:58:24.5139916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5139956Z raise RuntimeError(msg) 2025-12-04T11:58:24.5140294Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
2025-12-04T11:58:24.5140298Z 2025-12-04T11:58:24.5140372Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5140596Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5140598Z 2025-12-04T11:58:24.5140685Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5140687Z 2025-12-04T11:58:24.5140746Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5140825Z Traceback (most recent call last): 2025-12-04T11:58:24.5140986Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5141028Z getattr(self, test_name)() 2025-12-04T11:58:24.5141188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5141223Z fn() 2025-12-04T11:58:24.5141373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5141413Z method(*args, **kwargs) 2025-12-04T11:58:24.5141561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5141601Z method(*args, **kwargs) 2025-12-04T11:58:24.5141751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5141789Z with policy(): 2025-12-04T11:58:24.5141941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5141982Z raise RuntimeError(msg) 2025-12-04T11:58:24.5142347Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5142351Z 2025-12-04T11:58:24.5142423Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5142645Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5142647Z 2025-12-04T11:58:24.5142734Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5142798Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5142860Z ======================= 1 failed, 7 deselected in 8.92s ======================== 2025-12-04T11:58:24.5142900Z Got exit code 1 2025-12-04T11:58:24.5142940Z Retrying single test... 
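The failure above comes from the memory-leak check whose __exit__ (torch/testing/_internal/common_utils.py, line 2705 in every traceback) raises once per-device memory after the test is higher than before it: the RuntimeError reports both a caching-allocator figure and a driver-level figure. The following is only a rough sketch of that before/after comparison, under the assumption that torch.cuda.memory_allocated() and torch.cuda.mem_get_info() approximate those two numbers; it is not the actual PyTorch implementation, and the snapshot/check_for_leak names are invented for illustration.

    import torch

    def snapshot(device: int) -> tuple[int, int]:
        # (caching-allocator bytes, driver-level allocated bytes) for one GPU
        torch.cuda.synchronize(device)
        allocator_bytes = torch.cuda.memory_allocated(device)
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)
        return allocator_bytes, total_bytes - free_bytes

    def check_for_leak(device: int, run_test) -> None:
        before_alloc, before_driver = snapshot(device)
        run_test()
        torch.cuda.empty_cache()  # drop cached blocks so any growth reflects live allocations
        after_alloc, after_driver = snapshot(device)
        if after_alloc > before_alloc and after_driver > before_driver:
            raise RuntimeError(
                f"possible leak on device {device}: allocator {before_alloc} -> {after_alloc}, "
                f"driver {before_driver} -> {after_driver}"
            )

The repro line printed by the harness (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda) presumably enables the same check when reproducing outside CI.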
2025-12-04T11:58:24.5143146Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-40549f4e9028d159.xml 2025-12-04T11:58:24.5143204Z ============================= test session starts ============================== 2025-12-04T11:58:24.5143317Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5143357Z cachedir: .pytest_cache 2025-12-04T11:58:24.5143516Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5143563Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5143604Z configfile: pytest.ini 2025-12-04T11:58:24.5143767Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5143840Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5144060Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5144105Z Running 1 items in this shard 2025-12-04T11:58:24.5144107Z 2025-12-04T11:58:24.5144409Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:57:39.158000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359805 2025-12-04T11:58:24.5144563Z I1204 11:57:39.159000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359806 2025-12-04T11:58:24.5144735Z I1204 11:57:39.159000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359807 2025-12-04T11:58:24.5144886Z I1204 11:57:39.161000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359808 2025-12-04T11:58:24.5145385Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5145447Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5145937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5145997Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5146500Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5146560Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5147047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5147106Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5147252Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5147416Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5147707Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5147864Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5148192Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5148319Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5148597Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5148745Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5149062Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5149210Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5149485Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5149624Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5149903Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5150054Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5150519Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5150667Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5150865Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5151218Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5151333Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5151544Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5151711Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5151750Z dist init r=2, world=4 2025-12-04T11:58:24.5151889Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5152050Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5152339Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5152494Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5152780Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5152905Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5153180Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5153346Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5153622Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5153771Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5154046Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5154181Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5154462Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5154610Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5155094Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5155210Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5155406Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5155758Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5155873Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5156085Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5156249Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5156289Z dist init r=1, world=4 2025-12-04T11:58:24.5156427Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5156588Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5156875Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5157030Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5157315Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5157438Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5157732Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5157881Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5158191Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5158338Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5158613Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5158750Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5159029Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5159212Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5159674Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:58:24.5159790Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5159985Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5160337Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5160451Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5160661Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5160827Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5160865Z dist init r=0, world=4 2025-12-04T11:58:24.5161003Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5161164Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5161452Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5161606Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5161920Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5162044Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5162321Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5162469Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5162743Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5162892Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5163167Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5163328Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5163608Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5163756Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5164218Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3456106496. 2025-12-04T11:58:24.5164334Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5164531Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5164883Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5164998Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5165209Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5165374Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5165413Z dist init r=3, world=4 2025-12-04T11:58:24.5165451Z FAILED [8.5139s] [100%] 2025-12-04T11:58:24.5165453Z 2025-12-04T11:58:24.5165511Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5165607Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:58:24.5165654Z Traceback (most recent call last): 2025-12-04T11:58:24.5165816Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5165879Z self._join_processes(fn) 2025-12-04T11:58:24.5166052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5166107Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5166287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5166331Z raise RuntimeError(error) 2025-12-04T11:58:24.5166413Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5166458Z Traceback (most recent call last): 2025-12-04T11:58:24.5166619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5166661Z getattr(self, test_name)() 2025-12-04T11:58:24.5166821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5166856Z fn() 2025-12-04T11:58:24.5167008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5167048Z method(*args, **kwargs) 2025-12-04T11:58:24.5167219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5167259Z method(*args, **kwargs) 2025-12-04T11:58:24.5167409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5167446Z with policy(): 2025-12-04T11:58:24.5167599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5167639Z raise RuntimeError(msg) 2025-12-04T11:58:24.5167979Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5167981Z 2025-12-04T11:58:24.5168056Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5168335Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5168337Z 2025-12-04T11:58:24.5168426Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5168429Z 2025-12-04T11:58:24.5168430Z 2025-12-04T11:58:24.5168505Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5168594Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5168844Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-40549f4e9028d159.xml - 2025-12-04T11:58:24.5168905Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5169146Z FAILED [8.5139s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5169195Z Traceback (most recent call last): 2025-12-04T11:58:24.5169357Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5169401Z getattr(self, test_name)() 2025-12-04T11:58:24.5169558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5169594Z fn() 2025-12-04T11:58:24.5169783Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5169824Z method(*args, **kwargs) 2025-12-04T11:58:24.5169973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5170015Z method(*args, **kwargs) 2025-12-04T11:58:24.5170164Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5170201Z with policy(): 2025-12-04T11:58:24.5170356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5170396Z raise RuntimeError(msg) 2025-12-04T11:58:24.5170739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5170742Z 2025-12-04T11:58:24.5170814Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5171037Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5171071Z 2025-12-04T11:58:24.5171158Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5171221Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T11:58:24.5171283Z ======================= 1 failed, 7 deselected in 8.52s ======================== 2025-12-04T11:58:24.5171321Z Got exit code 1 2025-12-04T11:58:24.5171493Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5171623Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.5171830Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2e453835a51e706a.xml 2025-12-04T11:58:24.5171888Z ============================= test session starts ============================== 2025-12-04T11:58:24.5172002Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5172043Z cachedir: .pytest_cache 2025-12-04T11:58:24.5172200Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5172246Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5172287Z configfile: pytest.ini 2025-12-04T11:58:24.5172449Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5172521Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5172576Z stepcurrent: skipping 7 already run items. 2025-12-04T11:58:24.5172621Z Running 1 items in this shard 2025-12-04T11:58:24.5172623Z 2025-12-04T11:58:24.5172926Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:57:50.492000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360199 2025-12-04T11:58:24.5173082Z I1204 11:57:50.493000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360200 2025-12-04T11:58:24.5173234Z I1204 11:57:50.494000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360201 2025-12-04T11:58:24.5173386Z I1204 11:57:50.494000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360202 2025-12-04T11:58:24.5173899Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5173963Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5174454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5174514Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5175005Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5175085Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5175569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5175627Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5175771Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5175936Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5176224Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5176381Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5176666Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5176792Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5177073Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5177222Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5177501Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5177649Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5177944Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5178081Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5178409Z [rank2]:E1204 11:57:57.692000 360201 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5178560Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5179022Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5179140Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5179335Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5179727Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5179842Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5180053Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5180219Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5180259Z dist init r=2, world=4 2025-12-04T11:58:24.5180397Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5180557Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5180843Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5180997Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5181283Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5181407Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5181686Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5181835Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5182109Z [rank0]:E1204 11:57:57.699000 360199 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5182288Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5182563Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5182702Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5182979Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5183128Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5183591Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5183730Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5183927Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5184279Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5184395Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5184606Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5184769Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5184811Z dist init r=0, world=4 2025-12-04T11:58:24.5184948Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5185107Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5185394Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5185549Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5185833Z [rank1]:E1204 11:57:57.734000 360200 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5185959Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5186237Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5186384Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5186678Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5186825Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5187101Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5187236Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5187515Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5187663Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5188127Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
2025-12-04T11:58:24.5188301Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5188498Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5188852Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5188965Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5189177Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5189341Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5189379Z dist init r=1, world=4 2025-12-04T11:58:24.5189516Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5189676Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5189962Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5190117Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5190401Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5190524Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5190834Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5190983Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5191259Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5191407Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5191681Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5191819Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5192094Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5192275Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5192735Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2243952640 and is now 3456106496. 2025-12-04T11:58:24.5192849Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5193047Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5193397Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5193514Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5193724Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5193888Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5193927Z dist init r=3, world=4 2025-12-04T11:58:24.5193967Z FAILED [8.5130s] [100%] 2025-12-04T11:58:24.5193969Z 2025-12-04T11:58:24.5194025Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5194122Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:58:24.5194170Z Traceback (most recent call last): 2025-12-04T11:58:24.5194332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5194376Z self._join_processes(fn) 2025-12-04T11:58:24.5194549Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5194603Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5194781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5194854Z raise RuntimeError(error) 2025-12-04T11:58:24.5194935Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5194981Z Traceback (most recent call last): 2025-12-04T11:58:24.5195142Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5195187Z getattr(self, test_name)() 2025-12-04T11:58:24.5195345Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5195380Z fn() 2025-12-04T11:58:24.5195531Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5195572Z method(*args, **kwargs) 2025-12-04T11:58:24.5195721Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5195763Z method(*args, **kwargs) 2025-12-04T11:58:24.5195913Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5195950Z with policy(): 2025-12-04T11:58:24.5196101Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5196164Z raise RuntimeError(msg) 2025-12-04T11:58:24.5196499Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5196502Z 2025-12-04T11:58:24.5196577Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5196801Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5196803Z 2025-12-04T11:58:24.5196891Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5196893Z 2025-12-04T11:58:24.5196895Z 2025-12-04T11:58:24.5196971Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5197058Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5197305Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2e453835a51e706a.xml - 2025-12-04T11:58:24.5197366Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5197611Z FAILED [8.5130s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5197660Z Traceback (most recent call last): 2025-12-04T11:58:24.5197823Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5197866Z getattr(self, test_name)() 2025-12-04T11:58:24.5198026Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5198061Z fn() 2025-12-04T11:58:24.5198254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5198296Z method(*args, **kwargs) 2025-12-04T11:58:24.5198445Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5198486Z method(*args, **kwargs) 2025-12-04T11:58:24.5198670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5198708Z with policy(): 2025-12-04T11:58:24.5198859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5198901Z raise RuntimeError(msg) 2025-12-04T11:58:24.5199238Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5199241Z 2025-12-04T11:58:24.5199316Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5199538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5199541Z 2025-12-04T11:58:24.5199629Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5199693Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5199755Z ======================= 1 failed, 7 deselected in 8.52s ======================== 2025-12-04T11:58:24.5199794Z Got exit code 1 2025-12-04T11:58:24.5199869Z Retrying single test... 
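Note (illustration, not part of the log): the failure above comes from PyTorch's CUDA memory-leak check (enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the printed repro command), which compares caching-allocator and driver-level memory before and after the test. The sketch below is a minimal stand-in for that idea, not the actual CudaMemoryLeakCheck implementation in common_utils.py; the workload and thresholds are placeholders.

```python
# Minimal before/after CUDA memory comparison, in the spirit of the leak check
# reported in the log. NOT the common_utils.py implementation; it only shows
# the two quantities quoted in the failure: caching-allocator bytes
# (torch.cuda.memory_allocated) and driver-level usage (torch.cuda.mem_get_info).
import torch


def driver_allocated(device: int) -> int:
    # mem_get_info returns (free, total) bytes for the device.
    free, total = torch.cuda.mem_get_info(device)
    return total - free


def run_with_leak_check(fn, device: int = 0) -> None:
    if not torch.cuda.is_available():
        print("No CUDA/ROCm device available; skipping check.")
        return
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    driver_before = driver_allocated(device)

    fn()

    # Drop cached blocks so any growth reflects live allocations, then re-measure.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    driver_after = driver_allocated(device)

    if alloc_after > alloc_before:
        raise RuntimeError(
            f"Possible leak on device {device}: caching allocator went from "
            f"{alloc_before} to {alloc_after} bytes "
            f"(driver: {driver_before} -> {driver_after})."
        )


# Example usage: a workload that "leaks" by keeping a global reference alive.
_leaked = []


def leaky_workload() -> None:
    _leaked.append(torch.ones(1024, device="cuda"))


if __name__ == "__main__":
    run_with_leak_check(leaky_workload)
```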
2025-12-04T11:58:24.5200077Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e1a51fb950d40eb7.xml 2025-12-04T11:58:24.5200135Z ============================= test session starts ============================== 2025-12-04T11:58:24.5200248Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5200289Z cachedir: .pytest_cache 2025-12-04T11:58:24.5200447Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5200495Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5200536Z configfile: pytest.ini 2025-12-04T11:58:24.5200697Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5200769Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5200988Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5201034Z Running 1 items in this shard 2025-12-04T11:58:24.5201036Z 2025-12-04T11:58:24.5201336Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:58:01.833000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360593 2025-12-04T11:58:24.5201490Z I1204 11:58:01.833000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360594 2025-12-04T11:58:24.5201645Z I1204 11:58:01.834000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360595 2025-12-04T11:58:24.5201794Z I1204 11:58:01.835000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360596 2025-12-04T11:58:24.5202294Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5202355Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5202862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5202922Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5203407Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5203466Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5203950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5204007Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5204169Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5204332Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5204622Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5204778Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5205064Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5205190Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5205467Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5205614Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5205892Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5206038Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5206314Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5206453Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5206734Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5206901Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5207365Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5207483Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5207680Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5208033Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5208179Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5208390Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5208588Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5208627Z dist init r=1, world=4 2025-12-04T11:58:24.5208765Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5208924Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5209213Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5209367Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5209653Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5209778Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5210052Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5210201Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5210475Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5210624Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5210897Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5211034Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5211353Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5211501Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5211964Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5212079Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5212276Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5212626Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5212761Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5212971Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5213135Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5213272Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5213434Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5213722Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5213877Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5214160Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5214285Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5214562Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5214709Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5214986Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5215132Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5215425Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5215563Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5215843Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5215993Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5216454Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2243952640 and is now 3456106496. 
2025-12-04T11:58:24.5216569Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5216764Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5217952Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5218066Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5218319Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5218485Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5218524Z dist init r=0, world=4 2025-12-04T11:58:24.5218562Z dist init r=3, world=4 2025-12-04T11:58:24.5218699Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5218860Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5219146Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5219300Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5219585Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5219709Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5219986Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5220134Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5220410Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5220592Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5220866Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5221004Z [rank2]:E1204 11:58:09.076000 360595 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5221284Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5221432Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5221895Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5222039Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5222235Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5222584Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5222700Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5222911Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5223076Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5223114Z dist init r=2, world=4 2025-12-04T11:58:24.5223152Z FAILED [8.3134s] [100%] 2025-12-04T11:58:24.5223154Z 2025-12-04T11:58:24.5223211Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5223308Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:58:24.5223355Z Traceback (most recent call last): 2025-12-04T11:58:24.5223517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5223563Z self._join_processes(fn) 2025-12-04T11:58:24.5223735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5223790Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5223968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5224013Z raise RuntimeError(error) 2025-12-04T11:58:24.5224093Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5224139Z Traceback (most recent call last): 2025-12-04T11:58:24.5224300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5224342Z getattr(self, test_name)() 2025-12-04T11:58:24.5224522Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5224558Z fn() 2025-12-04T11:58:24.5224709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5224751Z method(*args, **kwargs) 2025-12-04T11:58:24.5224903Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5224942Z method(*args, **kwargs) 2025-12-04T11:58:24.5225093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5225129Z with policy(): 2025-12-04T11:58:24.5225282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5225322Z raise RuntimeError(msg) 2025-12-04T11:58:24.5225663Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5225665Z 2025-12-04T11:58:24.5225740Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5225991Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5225993Z 2025-12-04T11:58:24.5226081Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5226083Z 2025-12-04T11:58:24.5226084Z 2025-12-04T11:58:24.5226160Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5226249Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5226499Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e1a51fb950d40eb7.xml - 2025-12-04T11:58:24.5226560Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5226799Z FAILED [8.3134s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5226848Z Traceback (most recent call last): 2025-12-04T11:58:24.5227010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5227053Z getattr(self, test_name)() 2025-12-04T11:58:24.5227211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5227246Z fn() 2025-12-04T11:58:24.5227398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5227439Z method(*args, **kwargs) 2025-12-04T11:58:24.5227587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5227629Z method(*args, **kwargs) 2025-12-04T11:58:24.5227779Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5227817Z with policy(): 2025-12-04T11:58:24.5227968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5228010Z raise RuntimeError(msg) 2025-12-04T11:58:24.5228427Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5228431Z 2025-12-04T11:58:24.5228504Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5228727Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5228730Z 2025-12-04T11:58:24.5228817Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5228880Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5228941Z ======================= 1 failed, 7 deselected in 8.32s ======================== 2025-12-04T11:58:24.5228979Z Got exit code 1 2025-12-04T11:58:24.5229019Z Retrying single test... 
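Note (illustration, not part of the log): both retry sessions also emit the FSDP UserWarning that `device_id` was passed as a bare "cuda" without an index, and the warning itself names the two fixes: call `torch.cuda.set_device()` before FSDP initialization, or pass an explicit device index as `device_id`. The sketch below shows both options under the assumption of a torchrun-style per-rank launch; the tiny model and process-group details are placeholders, not taken from the failing test.

```python
# Per-rank setup addressing the FSDP `device_id` UserWarning from the log.
# Assumes launch via torchrun (LOCAL_RANK set); model and backend are
# illustrative placeholders, not the test_fsdp_exec_order.py configuration.
import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main() -> None:
    rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")

    # Option 1: make the current device explicit before FSDP initialization.
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(8, 8).cuda()

    # Option 2: pass an explicit device index instead of the bare "cuda".
    fsdp_model = FSDP(model, device_id=rank)

    out = fsdp_model(torch.randn(4, 8, device=f"cuda:{rank}"))
    print(f"rank {rank}: output shape {tuple(out.shape)}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```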
2025-12-04T11:58:24.5229225Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-94eafe77b3573973.xml 2025-12-04T11:58:24.5229283Z ============================= test session starts ============================== 2025-12-04T11:58:24.5229395Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5229436Z cachedir: .pytest_cache 2025-12-04T11:58:24.5229594Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5229679Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5229721Z configfile: pytest.ini 2025-12-04T11:58:24.5229883Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5229957Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5230176Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5230222Z Running 1 items in this shard 2025-12-04T11:58:24.5230226Z 2025-12-04T11:58:24.5230528Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:58:12.936000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360987 2025-12-04T11:58:24.5230684Z I1204 11:58:12.936000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360988 2025-12-04T11:58:24.5230837Z I1204 11:58:12.937000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360989 2025-12-04T11:58:24.5230987Z I1204 11:58:12.938000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360990 2025-12-04T11:58:24.5231486Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5231547Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5232036Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5232096Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5232601Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5232661Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5233146Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5233205Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5233349Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5233511Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5233802Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5233979Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5234264Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5234389Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5234669Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5234818Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5235097Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5235244Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5235519Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5235657Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5235934Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5236087Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5236553Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5236669Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5236883Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5237239Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5237354Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5237565Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5237729Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5237770Z dist init r=1, world=4 2025-12-04T11:58:24.5237908Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5238067Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5238407Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5238562Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5238847Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5238972Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5239248Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5239398Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5239674Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5239821Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5240098Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5240233Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5240512Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5240659Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5241151Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5241267Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5241465Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5241819Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5241933Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5242146Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5242311Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5242390Z dist init r=2, world=4 2025-12-04T11:58:24.5242527Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5242687Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5242975Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5243129Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5243413Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5243538Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5243813Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5243962Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5244241Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5244389Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5244664Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5244801Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5245077Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5245246Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5245706Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:58:24.5245822Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5246019Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5246372Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5246486Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5246696Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5246880Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5246918Z dist init r=0, world=4 2025-12-04T11:58:24.5247056Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5247214Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5247503Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5247657Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5247943Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5248067Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5248381Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5248531Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5248805Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5248953Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5249228Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5249364Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
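[editor's note] The "dist init r=N, world=4" lines are each rank initializing its process group, and later in this log the same spawn-based suite is re-run with both "env init_method" and "file init_method". The snippet below is a generic, hedged sketch of those two initialization styles using public torch.distributed APIs; it is not the harness's own setup code, and the address, port, and file path are placeholders.

```python
# Hedged sketch of env:// vs file:// process-group initialization, the two
# init methods referenced later in this log. Generic torch.distributed usage,
# not the test harness's code; MASTER_ADDR/MASTER_PORT and the store path
# below are illustrative placeholders.
import os
import torch.distributed as dist


def init_with_env(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", init_method="env://",
                            rank=rank, world_size=world_size)


def init_with_file(rank: int, world_size: int) -> None:
    # file:// requires a path visible to every rank (shared filesystem)
    dist.init_process_group("nccl", init_method="file:///tmp/dist_init_store",
                            rank=rank, world_size=world_size)
```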
2025-12-04T11:58:24.5249676Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5249824Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5250286Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:58:24.5250400Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5250598Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5250949Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5251096Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5251308Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5251471Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5251510Z dist init r=3, world=4 2025-12-04T11:58:24.5251548Z FAILED [8.2126s] [100%] 2025-12-04T11:58:24.5251550Z 2025-12-04T11:58:24.5251608Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5251705Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:58:24.5251752Z Traceback (most recent call last): 2025-12-04T11:58:24.5251915Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5251960Z self._join_processes(fn) 2025-12-04T11:58:24.5252133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5252188Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5252365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5252409Z raise RuntimeError(error) 2025-12-04T11:58:24.5252492Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5252537Z Traceback (most recent call last): 2025-12-04T11:58:24.5252700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5252743Z getattr(self, test_name)() 2025-12-04T11:58:24.5252902Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5252936Z fn() 2025-12-04T11:58:24.5253088Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5253128Z method(*args, **kwargs) 2025-12-04T11:58:24.5253278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5253318Z method(*args, **kwargs) 2025-12-04T11:58:24.5253487Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5253524Z with policy(): 2025-12-04T11:58:24.5253675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5253717Z raise RuntimeError(msg) 2025-12-04T11:58:24.5254053Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5254056Z 2025-12-04T11:58:24.5254131Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5254355Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5254357Z 2025-12-04T11:58:24.5254447Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5254449Z 2025-12-04T11:58:24.5254508Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5254555Z Traceback (most recent call last): 2025-12-04T11:58:24.5254739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5254782Z getattr(self, test_name)() 2025-12-04T11:58:24.5254940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5254975Z fn() 2025-12-04T11:58:24.5255125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5255166Z method(*args, **kwargs) 2025-12-04T11:58:24.5255317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5255357Z method(*args, **kwargs) 2025-12-04T11:58:24.5255506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5255543Z with policy(): 2025-12-04T11:58:24.5255695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5255736Z raise RuntimeError(msg) 2025-12-04T11:58:24.5256071Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
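[editor's note] The "Process N exited with error code 10" frames are raised on the parent side: the traceback shows it joining the per-rank child processes (_join_processes) and re-raising when any exit code is non-zero (_check_return_codes). A much-simplified, hedged version of that pattern using torch.multiprocessing is shown below; the real harness tracks pipes, timeouts, and per-rank error messages that this sketch omits.

```python
# Hedged sketch of the join-and-check pattern visible in the traceback above.
# torch.multiprocessing.spawn with the default join=True already blocks until
# all ranks finish and raises (e.g. ProcessExitedException) if a child exits
# non-zero, which is the behaviour the parent-side RuntimeError reports.
# Not PyTorch's internal code; function names here are illustrative.
import torch.multiprocessing as mp


def _per_rank_body(rank: int, world_size: int) -> None:
    # a real test would init the process group and run its assertions here
    pass


def run_in_subprocesses(world_size: int = 4) -> None:
    mp.spawn(_per_rank_body, args=(world_size,), nprocs=world_size)
```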
2025-12-04T11:58:24.5256073Z 2025-12-04T11:58:24.5256148Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5256371Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5256373Z 2025-12-04T11:58:24.5256460Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5256462Z 2025-12-04T11:58:24.5256466Z 2025-12-04T11:58:24.5256541Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5256628Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5256876Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-94eafe77b3573973.xml - 2025-12-04T11:58:24.5256936Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5257203Z FAILED [8.2126s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5257251Z Traceback (most recent call last): 2025-12-04T11:58:24.5257417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5257458Z getattr(self, test_name)() 2025-12-04T11:58:24.5257620Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5257654Z fn() 2025-12-04T11:58:24.5257805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5257846Z method(*args, **kwargs) 2025-12-04T11:58:24.5257996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5258035Z method(*args, **kwargs) 2025-12-04T11:58:24.5258225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5258263Z with policy(): 2025-12-04T11:58:24.5258414Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5258490Z raise RuntimeError(msg) 2025-12-04T11:58:24.5258827Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
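[editor's note] Each test file run also drops a JUnit-style report (the "- generated xml file: ..." lines above). To inspect failures programmatically instead of scrolling the log, a short standard-library reader is enough; the path below is copied from the log, and the tag layout assumed here is the usual pytest junitxml format (testsuite/testcase/failure).

```python
# Hedged sketch: list failed test cases from the pytest-generated JUnit XML
# named in the log above. Assumes the standard pytest junitxml layout.
import xml.etree.ElementTree as ET

report = ("/var/lib/jenkins/pytorch/test/test-reports/python-pytest/"
          "distributed.fsdp.test_fsdp_exec_order/"
          "distributed.fsdp.test_fsdp_exec_order-94eafe77b3573973.xml")

root = ET.parse(report).getroot()
for case in root.iter("testcase"):
    failure = case.find("failure")
    if failure is not None:
        print(f"{case.get('classname')}.{case.get('name')}: "
              f"{failure.get('message', '')[:120]}")
```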
2025-12-04T11:58:24.5258829Z 2025-12-04T11:58:24.5258903Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5259125Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5259127Z 2025-12-04T11:58:24.5259216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5259218Z 2025-12-04T11:58:24.5259276Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5259323Z Traceback (most recent call last): 2025-12-04T11:58:24.5259486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5259528Z getattr(self, test_name)() 2025-12-04T11:58:24.5259688Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5259722Z fn() 2025-12-04T11:58:24.5259873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5259912Z method(*args, **kwargs) 2025-12-04T11:58:24.5260061Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5260101Z method(*args, **kwargs) 2025-12-04T11:58:24.5260251Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5260287Z with policy(): 2025-12-04T11:58:24.5260438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5260481Z raise RuntimeError(msg) 2025-12-04T11:58:24.5260816Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5260818Z 2025-12-04T11:58:24.5260890Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5261142Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5261144Z 2025-12-04T11:58:24.5261231Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5261294Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
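[editor's note] The repro instructions printed above can be run directly from the repo root. The wrapper below just sets the two environment variables the harness mentions and invokes the same test; the command, variables, and test ID are verbatim from the log, and the subprocess driver is merely for convenience (it assumes a ROCm/CUDA-enabled build with enough GPUs).

```python
# Convenience wrapper around the repro command the harness prints above.
# Run from the pytorch repo root; env vars and test path are copied verbatim.
import os
import subprocess
import sys

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        sys.executable,
        "test/distributed/fsdp/test_fsdp_exec_order.py",
        "TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda",
    ],
    env=env,
    check=True,
)
```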
2025-12-04T11:58:24.5261357Z ======================= 1 failed, 7 deselected in 8.22s ======================== 2025-12-04T11:58:24.5261395Z Got exit code 1 2025-12-04T11:58:24.5261569Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5261697Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.5261903Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8433612606330e98.xml 2025-12-04T11:58:24.5261962Z ============================= test session starts ============================== 2025-12-04T11:58:24.5262074Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5262115Z cachedir: .pytest_cache 2025-12-04T11:58:24.5262272Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5262340Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5262381Z configfile: pytest.ini 2025-12-04T11:58:24.5262542Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5262614Z collecting ... collected 8 items / 8 deselected / 0 selected 2025-12-04T11:58:24.5262666Z stepcurrent: skipping 8 already run items. 2025-12-04T11:58:24.5262711Z Running 0 items in this shard 2025-12-04T11:58:24.5262713Z 2025-12-04T11:58:24.5262959Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8433612606330e98.xml - 2025-12-04T11:58:24.5263018Z ============================ 8 deselected in 0.00s ============================= 2025-12-04T11:58:24.5264519Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda'] 2025-12-04T11:58:24.5264525Z 2025-12-04T11:58:24.5264726Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order 1/1 (test/test-reports/distributed.fsdp.test_fsdp_exec_order_1.1_e994e873868c2dab_.log) 2025-12-04T11:58:24.5264728Z 2025-12-04T11:58:24.5264859Z Finished distributed/fsdp/test_fsdp_exec_order 1/1 ... 
[2025-12-04 11:58:24.429614][2289003.078794959], took 4.36min 2025-12-04T11:58:24.5265122Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:58:24.5265227Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:58:24.5265324Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:58:24.5265372Z Uploading artifacts took 0.00 seconds 2025-12-04T11:58:24.5265433Z distributed/fsdp/test_fsdp_exec_order 1/1 failed! 2025-12-04T11:58:24.5265556Z Running distributed/fsdp/test_fsdp_flatten_params 1/1 ... [2025-12-04 11:58:24.432583][2289003.08176702] 2025-12-04T11:58:24.5265605Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:58:24.5265934Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_flatten_params.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:58:24.432769] 2025-12-04T11:59:27.4920090Z 2025-12-04T11:59:27.4921132Z distributed/fsdp/test_fsdp_flatten_params 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_flatten_params_1.1_bf7ca175952f8a78_.log 2025-12-04T11:59:27.4926069Z Running 14 items in this shard: test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_empty_module, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_aligned_full_precision, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_aligned_mixed_precision, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_unaligned, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_with_memory_format_memory_format0, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_with_memory_format_memory_format1, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flatten_nothing, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_numel_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_numel_without_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_output_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_output_without_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_partial_flattening, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_pnorm_after_step_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_writeback_orig_params_no_shard 2025-12-04T11:59:27.4930878Z 2025-12-04T11:59:27.4931121Z Finished distributed/fsdp/test_fsdp_flatten_params 1/1 ... 
[2025-12-04 11:59:27.491672][2289066.140851202], took 1.05min 2025-12-04T11:59:27.4935288Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:59:27.4952281Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:59:27.4955011Z Running distributed/test_distributed_spawn 3/7 ... [2025-12-04 11:59:27.495347][2289066.144531064] 2025-12-04T11:59:27.4956402Z MPI not available -- MPI backend tests will be skipped 2025-12-04T11:59:27.4956793Z Running distributed tests for the test backend with env init_method 2025-12-04T11:59:27.4958489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:27.4960011Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:59:27.495834] 2025-12-04T11:59:29.5222749Z 2025-12-04T11:59:29.5224020Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_a71a9c699ade0e28_.log 2025-12-04T11:59:29.5224391Z Running 0 items in this shard: 2025-12-04T11:59:29.5224479Z 2025-12-04T11:59:29.5228826Z Running distributed tests for the test backend with file init_method 2025-12-04T11:59:29.5231631Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:29.5232667Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:59:29.523057] 2025-12-04T11:59:31.4789067Z 2025-12-04T11:59:31.4790202Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_6b98ab0038112441_.log 2025-12-04T11:59:31.4790868Z Running 0 items in this shard: 2025-12-04T11:59:31.4791023Z 2025-12-04T11:59:31.4795732Z Running distributed tests for the nccl backend with env init_method 2025-12-04T11:59:31.4798529Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:31.4799563Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:59:31.479709] 2025-12-04T12:02:39.4508102Z 2025-12-04T12:02:39.4509129Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_a149f9d8bf39377a_.log 2025-12-04T12:02:39.4520767Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:02:39.4529682Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:02:39.4530245Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:02:39.4530738Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:02:39.4531232Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:02:39.4531723Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:02:39.4532171Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:02:39.4532651Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:02:39.4533168Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:02:39.4533689Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:02:39.4534204Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:02:39.4534679Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:02:39.4535128Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:02:39.4535585Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:02:39.4536110Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:02:39.4536664Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:02:39.4537220Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:02:39.4537676Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:02:39.4538044Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:02:39.4538459Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:02:39.4538855Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:02:39.4539234Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:02:39.4539580Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:02:39.4539921Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:02:39.4540300Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:02:39.4540738Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:02:39.4541116Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:02:39.4541475Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:02:39.4541844Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:02:39.4542240Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:02:39.4542616Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:02:39.4542976Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:02:39.4543322Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:02:39.4543676Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:02:39.4544053Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:02:39.4544416Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:02:39.4544788Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:02:39.4545008Z 2025-12-04T12:02:39.4545098Z Running distributed tests for the nccl backend with file init_method 2025-12-04T12:02:39.4545277Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:02:39.4545717Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:02:39.452037] 2025-12-04T12:05:47.4333374Z 2025-12-04T12:05:47.4334245Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_88b49a8a749fb92c_.log 2025-12-04T12:05:47.4345649Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:05:47.4354489Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:05:47.4355019Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:05:47.4355549Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:05:47.4356021Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:05:47.4356498Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:05:47.4356929Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:05:47.4357382Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:05:47.4357865Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:05:47.4358419Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:05:47.4358900Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:05:47.4359395Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:05:47.4359818Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:05:47.4360250Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:05:47.4360751Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:05:47.4361279Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:05:47.4361728Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:05:47.4362160Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:05:47.4362583Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:05:47.4363025Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:05:47.4363494Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:05:47.4363891Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:05:47.4364230Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:05:47.4364564Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:05:47.4364940Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:05:47.4365318Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:05:47.4365685Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:05:47.4366033Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:05:47.4366439Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:05:47.4366825Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:05:47.4367195Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:05:47.4367547Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:05:47.4367882Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:05:47.4368397Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:05:47.4368769Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:05:47.4369122Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:05:47.4369483Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:05:47.4369738Z 2025-12-04T12:05:47.4369826Z Running distributed tests for the gloo backend with env init_method 2025-12-04T12:05:47.4369998Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:05:47.4370426Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:05:47.434600] 2025-12-04T12:08:24.4611049Z 2025-12-04T12:08:24.4612231Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_8fc46078fe9fbf29_.log 2025-12-04T12:08:24.4624876Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:08:24.4633044Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:08:24.4633601Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:08:24.4634089Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:08:24.4634578Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:08:24.4635063Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:08:24.4635505Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:08:24.4635979Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:08:24.4636480Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:08:24.4636993Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:08:24.4637497Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:08:24.4637959Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:08:24.4638448Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:08:24.4638896Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:08:24.4639472Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:08:24.4640020Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:08:24.4640417Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:08:24.4640774Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:08:24.4641128Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:08:24.4641499Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:08:24.4641893Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:08:24.4642265Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:08:24.4642678Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:08:24.4643011Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:08:24.4643379Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:08:24.4643760Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:08:24.4644133Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:08:24.4644485Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:08:24.4644846Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:08:24.4645236Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:08:24.4645605Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:08:24.4645959Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:08:24.4646296Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:08:24.4646644Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:08:24.4647033Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:08:24.4647387Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:08:24.4647749Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:08:24.4647964Z 2025-12-04T12:08:24.4648052Z Running distributed tests for the gloo backend with file init_method 2025-12-04T12:08:24.4648269Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:08:24.4648733Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:08:24.462411] 2025-12-04T12:11:02.7453618Z 2025-12-04T12:11:02.7454229Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_3934831f6ebb6547_.log 2025-12-04T12:11:02.7460535Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:11:02.7466218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:11:02.7466663Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:11:02.7467055Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:11:02.7467444Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:11:02.7467832Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:11:02.7468218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:11:02.7468594Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:11:02.7469041Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:11:02.7469448Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:11:02.7469847Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:11:02.7470219Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:11:02.7470572Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:11:02.7470928Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:11:02.7471344Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:11:02.7471783Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:11:02.7472159Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:11:02.7472515Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:11:02.7472869Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:11:02.7473239Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:11:02.7473630Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:11:02.7474001Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:11:02.7474337Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:11:02.7474668Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:11:02.7475040Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:11:02.7475455Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:11:02.7475822Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:11:02.7476174Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:11:02.7476533Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:11:02.7476921Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:11:02.7477290Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:11:02.7477644Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:11:02.7480494Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:11:02.7480841Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:11:02.7481251Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:11:02.7481602Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:11:02.7481963Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:11:02.7482176Z 2025-12-04T12:11:02.7482313Z Finished distributed/test_distributed_spawn 3/7 ... [2025-12-04 12:11:02.745869][2289761.395046316], took 11.59min 2025-12-04T12:11:02.7482753Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:11:02.7486883Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:11:02.7490669Z Running distributed/test_distributed_spawn 6/7 ... 
[2025-12-04 12:11:02.748959][2289761.398140065] 2025-12-04T12:11:02.7491132Z MPI not available -- MPI backend tests will be skipped 2025-12-04T12:11:02.7492025Z Running distributed tests for the test backend with env init_method 2025-12-04T12:11:02.7492905Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:11:02.7495027Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:11:02.749373] 2025-12-04T12:11:04.7062323Z 2025-12-04T12:11:04.7063761Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_5a8ddf85e205a8da_.log 2025-12-04T12:11:04.7064760Z Running 0 items in this shard: 2025-12-04T12:11:04.7064990Z 2025-12-04T12:11:04.7065287Z Running distributed tests for the test backend with file init_method 2025-12-04T12:11:04.7065785Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:11:04.7068501Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:11:04.706622] 2025-12-04T12:11:06.6567193Z 2025-12-04T12:11:06.6569194Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_1a03a9b75b486076_.log 2025-12-04T12:11:06.6570130Z Running 0 items in this shard: 2025-12-04T12:11:06.6570365Z 2025-12-04T12:11:06.6573219Z Running distributed tests for the nccl backend with env init_method 2025-12-04T12:11:06.6573943Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:11:06.6576482Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:11:06.657520] 2025-12-04T12:15:07.4004528Z 2025-12-04T12:15:07.4005646Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_3d920147986b72d2_.log 2025-12-04T12:15:07.4022005Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:15:07.4032306Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:15:07.4032947Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:15:07.4033478Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:15:07.4033990Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:15:07.4034425Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:15:07.4034833Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:15:07.4035243Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:15:07.4035643Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:15:07.4036045Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:15:07.4036455Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:15:07.4036883Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:15:07.4037321Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:15:07.4037755Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:15:07.4038242Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:15:07.4038650Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:15:07.4039100Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:15:07.4039507Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:15:07.4039897Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:15:07.4040270Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:15:07.4040661Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:15:07.4041086Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:15:07.4041551Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:15:07.4042015Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:15:07.4042461Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:15:07.4042879Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:15:07.4043270Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:15:07.4043715Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:15:07.4044203Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:15:07.4044659Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:15:07.4045078Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:15:07.4045428Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:15:07.4045777Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:15:07.4046125Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:15:07.4046465Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:15:07.4046794Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:15:07.4047122Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:15:07.4047462Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:15:07.4047827Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:15:07.4048283Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:15:07.4048708Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:15:07.4049106Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:15:07.4049445Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:15:07.4049798Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:15:07.4050002Z 2025-12-04T12:15:07.4050093Z Running distributed tests for the nccl backend with file init_method 2025-12-04T12:15:07.4050263Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:15:07.4050693Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:15:07.401730] 2025-12-04T12:19:07.4006544Z 2025-12-04T12:19:07.4007171Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_78c0cc8c2511fca6_.log 2025-12-04T12:19:07.4014806Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:19:07.4021657Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:19:07.4022111Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:19:07.4022489Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:19:07.4022855Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:19:07.4023224Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:19:07.4023597Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:19:07.4023972Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:19:07.4024335Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:19:07.4024703Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:19:07.4025083Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:19:07.4025481Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:19:07.4025878Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:19:07.4026277Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:19:07.4026683Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:19:07.4027101Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:19:07.4027460Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:19:07.4027833Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:19:07.4028243Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:19:07.4028581Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:19:07.4028939Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:19:07.4029333Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:19:07.4029751Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:19:07.4030199Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:19:07.4030604Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:19:07.4030972Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:19:07.4031328Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:19:07.4031734Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:19:07.4032182Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:19:07.4032626Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:19:07.4033042Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:19:07.4033389Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:19:07.4033735Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:19:07.4034084Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:19:07.4034421Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:19:07.4034753Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:19:07.4035082Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:19:07.4035422Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:19:07.4035785Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:19:07.4036199Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:19:07.4036656Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:19:07.4037025Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:19:07.4037386Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:19:07.4037737Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:19:07.4037940Z 2025-12-04T12:19:07.4038027Z Running distributed tests for the gloo backend with env init_method 2025-12-04T12:19:07.4038242Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:19:07.4038674Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:19:07.401956] 2025-12-04T12:22:43.2467296Z 2025-12-04T12:22:43.2467942Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_7af0010540e7ac65_.log 2025-12-04T12:22:43.2480303Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:22:43.2488441Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:22:43.2488898Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:22:43.2489280Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:22:43.2489647Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:22:43.2490017Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:22:43.2490386Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:22:43.2490763Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:22:43.2491149Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:22:43.2491516Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:22:43.2491890Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:22:43.2492280Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:22:43.2492676Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:22:43.2493122Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:22:43.2493530Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:22:43.2493899Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:22:43.2494254Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:22:43.2494624Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:22:43.2494977Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:22:43.2495312Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:22:43.2495670Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:22:43.2496086Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:22:43.2496522Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:22:43.2496943Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:22:43.2497328Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:22:43.2497694Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:22:43.2498050Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:22:43.2498494Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:22:43.2498941Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:22:43.2499385Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:22:43.2499803Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:22:43.2500149Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:22:43.2500496Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:22:43.2500845Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:22:43.2501183Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:22:43.2501510Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:22:43.2501834Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:22:43.2502173Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:22:43.2502577Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:22:43.2502989Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:22:43.2503416Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:22:43.2503783Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:22:43.2504119Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:22:43.2504471Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:22:43.2504672Z 2025-12-04T12:22:43.2504764Z Running distributed tests for the gloo backend with file init_method 2025-12-04T12:22:43.2504935Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:22:43.2505361Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:22:43.247549] 2025-12-04T12:26:16.0997611Z 2025-12-04T12:26:16.0998489Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_e2aa3221fb17d374_.log 2025-12-04T12:26:16.1012008Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:26:16.1021548Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:26:16.1022048Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:26:16.1022460Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:26:16.1022862Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:26:16.1023262Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:26:16.1023664Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:26:16.1024079Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:26:16.1024478Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:26:16.1024880Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:26:16.1025289Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:26:16.1025722Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:26:16.1026207Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:26:16.1026643Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:26:16.1027087Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:26:16.1027493Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:26:16.1027883Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:26:16.1028324Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:26:16.1028720Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:26:16.1029087Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:26:16.1029518Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:26:16.1029961Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:26:16.1030425Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:26:16.1030887Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:26:16.1031310Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:26:16.1031674Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:26:16.1032027Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:26:16.1032429Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:26:16.1032869Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:26:16.1033313Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:26:16.1033734Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:26:16.1034079Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:26:16.1034426Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:26:16.1034774Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:26:16.1035110Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:26:16.1035437Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:26:16.1035761Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:26:16.1036133Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:26:16.1036498Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:26:16.1036913Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:26:16.1037338Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:26:16.1037704Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:26:16.1038040Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:26:16.1038433Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:26:16.1038636Z 2025-12-04T12:26:16.1038809Z Finished distributed/test_distributed_spawn 6/7 ... [2025-12-04 12:26:16.100672][2290674.749850741], took 15.22min 2025-12-04T12:26:16.1039258Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:26:16.1039658Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:26:16.1039878Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:26:16.1040059Z Uploading artifacts took 0.00 seconds 2025-12-04T12:26:16.1041578Z Running distributed/fsdp/test_fsdp_traversal 1/1 ... [2025-12-04 12:26:16.104064][2290674.753248032] 2025-12-04T12:26:16.1041791Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:26:16.1043134Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_traversal.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:26:16.104229] 2025-12-04T12:26:41.5012658Z 2025-12-04T12:26:41.5013381Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal 1/1 (test/test-reports/distributed.fsdp.test_fsdp_traversal_1.1_ef9ad764013e9636_.log) 2025-12-04T12:26:41.5014017Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-fdadd662e8b4052c.xml 2025-12-04T12:26:41.5014444Z ============================= test session starts ============================== 2025-12-04T12:26:41.5014781Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5015043Z cachedir: .pytest_cache 2025-12-04T12:26:41.5015362Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5015696Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5015864Z configfile: pytest.ini 2025-12-04T12:26:41.5016210Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5016621Z collecting ... collected 1 item 2025-12-04T12:26:41.5016813Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:26:41.5017178Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5017432Z 2025-12-04T12:26:41.5017811Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:26:17.817000 448654 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 448723 2025-12-04T12:26:41.5019017Z I1204 12:26:17.818000 448654 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 448724 2025-12-04T12:26:41.5019475Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5019940Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5020598Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5021302Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5021978Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5022545Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5023190Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5023826Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5024412Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5024993Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5025579Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5026149Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5026721Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5027305Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5028095Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5028882Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5029323Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5030019Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5030610Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5031118Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5031543Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:26:41.5031792Z dist init r=1, world=2 2025-12-04T12:26:41.5032001Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5032372Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5032904Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5033432Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5033918Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5034415Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5034858Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5035327Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5035801Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5036267Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5036734Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5037194Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5037653Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5038123Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5038791Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 
2025-12-04T12:26:41.5039379Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5039731Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5040286Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5040790Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5041155Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5041569Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:26:41.5041809Z dist init r=0, world=2 2025-12-04T12:26:41.5042227Z [rank0]:[W1204 12:26:21.509278405 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:26:41.5042650Z FAILED [5.2094s] [100%] 2025-12-04T12:26:41.5042715Z 2025-12-04T12:26:41.5042781Z =================================== FAILURES =================================== 2025-12-04T12:26:41.5042967Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:26:41.5043161Z Traceback (most recent call last): 2025-12-04T12:26:41.5043409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:26:41.5043730Z self._join_processes(fn) 2025-12-04T12:26:41.5043978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:26:41.5044242Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:26:41.5044507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:26:41.5044765Z raise RuntimeError(error) 2025-12-04T12:26:41.5044917Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5045081Z Traceback (most recent call last): 2025-12-04T12:26:41.5045320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5045562Z getattr(self, test_name)() 2025-12-04T12:26:41.5045794Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5046026Z fn() 2025-12-04T12:26:41.5046229Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5046460Z method(*args, **kwargs) 2025-12-04T12:26:41.5046681Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5046910Z method(*args, **kwargs) 2025-12-04T12:26:41.5047127Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5047355Z with policy(): 2025-12-04T12:26:41.5047567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5047798Z raise RuntimeError(msg) 2025-12-04T12:26:41.5048218Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5048560Z 2025-12-04T12:26:41.5048639Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5048945Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5049175Z 2025-12-04T12:26:41.5049267Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5049393Z 2025-12-04T12:26:41.5049395Z 2025-12-04T12:26:41.5049522Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:26:41.5049728Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:26:41.5050106Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-fdadd662e8b4052c.xml - 2025-12-04T12:26:41.5050455Z =========================== short test summary info ============================ 2025-12-04T12:26:41.5050769Z FAILED [5.2094s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5051060Z Traceback (most recent call last): 2025-12-04T12:26:41.5051305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5051547Z getattr(self, test_name)() 2025-12-04T12:26:41.5051781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5052077Z fn() 2025-12-04T12:26:41.5052278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5052532Z method(*args, **kwargs) 2025-12-04T12:26:41.5052750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5052976Z method(*args, **kwargs) 2025-12-04T12:26:41.5053191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5053415Z with policy(): 2025-12-04T12:26:41.5053625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5053854Z raise RuntimeError(msg) 2025-12-04T12:26:41.5054303Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 
2025-12-04T12:26:41.5054649Z 2025-12-04T12:26:41.5054724Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5055028Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5055255Z 2025-12-04T12:26:41.5055344Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5055565Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:26:41.5055723Z ============================== 1 failed in 5.35s =============================== 2025-12-04T12:26:41.5055855Z Got exit code 1 2025-12-04T12:26:41.5055956Z Retrying single test... 2025-12-04T12:26:41.5056224Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-168649b91fa6c9f9.xml 2025-12-04T12:26:41.5056523Z ============================= test session starts ============================== 2025-12-04T12:26:41.5056735Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5056922Z cachedir: .pytest_cache 2025-12-04T12:26:41.5057145Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5057382Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5057499Z configfile: pytest.ini 2025-12-04T12:26:41.5057725Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5057966Z collecting ... collected 1 item 2025-12-04T12:26:41.5058293Z stepcurrent: skipping 0 already run items. 
Running only test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5058556Z Running 1 items in this shard 2025-12-04T12:26:41.5058628Z 2025-12-04T12:26:41.5058901Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:26:25.424000 448882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 448951 2025-12-04T12:26:41.5059366Z I1204 12:26:25.424000 448882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 448952 2025-12-04T12:26:41.5059699Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5060040Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5060533Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5061028Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5061526Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5061973Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5062420Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5062889Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5063352Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5063814Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5064275Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5064725Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5065180Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5065645Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5066271Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5066852Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5067225Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5067776Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5068292Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5068658Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5069072Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:26:41.5069311Z dist init r=0, world=2 2025-12-04T12:26:41.5069518Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5069855Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5070357Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5070854Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5071332Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5071778Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5072218Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5072682Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5073147Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5073612Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5074073Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 
3328, in wrapper 2025-12-04T12:26:41.5074524Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5074980Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5075445Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5076066Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5076673Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5077030Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5077583Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5078053Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5078461Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5078883Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:26:41.5079127Z dist init r=1, world=2 2025-12-04T12:26:41.5079547Z [rank0]:[W1204 12:26:29.077781766 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:26:41.5079977Z FAILED [5.2096s] [100%] 2025-12-04T12:26:41.5080040Z 2025-12-04T12:26:41.5080099Z =================================== FAILURES =================================== 2025-12-04T12:26:41.5080282Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:26:41.5080455Z Traceback (most recent call last): 2025-12-04T12:26:41.5080709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:26:41.5080959Z self._join_processes(fn) 2025-12-04T12:26:41.5081214Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:26:41.5081487Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:26:41.5081763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:26:41.5082031Z raise RuntimeError(error) 2025-12-04T12:26:41.5082191Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:26:41.5082360Z Traceback (most recent call last): 2025-12-04T12:26:41.5082607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5082857Z getattr(self, test_name)() 2025-12-04T12:26:41.5083097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5083335Z fn() 2025-12-04T12:26:41.5083546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5083786Z method(*args, **kwargs) 2025-12-04T12:26:41.5084015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5084255Z method(*args, **kwargs) 2025-12-04T12:26:41.5084480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5084712Z with policy(): 2025-12-04T12:26:41.5084931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5085169Z raise RuntimeError(msg) 2025-12-04T12:26:41.5085590Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5085935Z 2025-12-04T12:26:41.5086014Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5086322Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5086552Z 2025-12-04T12:26:41.5086642Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5086767Z 2025-12-04T12:26:41.5086769Z 2025-12-04T12:26:41.5086849Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:26:41.5087052Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:26:41.5087426Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-168649b91fa6c9f9.xml - 2025-12-04T12:26:41.5087779Z =========================== short test summary info ============================ 2025-12-04T12:26:41.5088110Z FAILED [5.2096s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:26:41.5088547Z Traceback (most recent call last): 2025-12-04T12:26:41.5088827Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5089350Z getattr(self, test_name)() 2025-12-04T12:26:41.5089628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5089895Z fn() 2025-12-04T12:26:41.5090164Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5090431Z method(*args, **kwargs) 2025-12-04T12:26:41.5090696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5090992Z method(*args, **kwargs) 2025-12-04T12:26:41.5091249Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5091529Z with policy(): 2025-12-04T12:26:41.5091786Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5092057Z raise RuntimeError(msg) 2025-12-04T12:26:41.5092486Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5092850Z 2025-12-04T12:26:41.5092937Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5093275Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5093551Z 2025-12-04T12:26:41.5093649Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5093884Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:26:41.5094085Z ============================== 1 failed in 5.36s =============================== 2025-12-04T12:26:41.5094263Z Got exit code 1 2025-12-04T12:26:41.5094404Z Retrying single test... 
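Note on the failure above: it is raised by the CUDA memory leak check that wraps each test when mem_leak_check is enabled for the shard. The check compares caching-allocator and driver-level memory counters on each device before and after the test body and fails when both have grown, which is exactly what the "Caching allocator allocated memory was 512 and is now reported as 2560" message reports. The lines below are a minimal sketch of that kind of before/after comparison, using only public torch.cuda APIs; it is an illustration, not the actual implementation in torch/testing/_internal/common_utils.py.

# Hedged sketch of a before/after CUDA memory comparison similar in spirit to the
# leak check reported above; not the actual CUDAMemoryLeakCheck implementation.
import torch

def check_for_leak(test_fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
    free_before, _total = torch.cuda.mem_get_info(device)   # driver-level free memory
    test_fn()
    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _total = torch.cuda.mem_get_info(device)
    # Flag a leak only when the caching allocator grew and the driver also
    # reports less free memory, mirroring the two-level check in the log above.
    if alloc_after > alloc_before and free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes"
        )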
2025-12-04T12:26:41.5094736Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-c6de5f5f6db77275.xml 2025-12-04T12:26:41.5095138Z ============================= test session starts ============================== 2025-12-04T12:26:41.5095412Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5095670Z cachedir: .pytest_cache 2025-12-04T12:26:41.5095930Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5096242Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5096387Z configfile: pytest.ini 2025-12-04T12:26:41.5096650Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5096961Z collecting ... collected 1 item 2025-12-04T12:26:41.5097258Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5097563Z Running 1 items in this shard 2025-12-04T12:26:41.5097661Z 2025-12-04T12:26:41.5097960Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:26:33.302000 449110 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449179 2025-12-04T12:26:41.5098503Z I1204 12:26:33.303000 449110 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449180 2025-12-04T12:26:41.5098910Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5099306Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5099832Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5100370Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5100890Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5101383Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5101869Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5102374Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5102894Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5103399Z [rank1]:E1204 12:26:37.131000 449180 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5103955Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5104451Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5104948Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5105512Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5106257Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5106872Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5107283Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5107872Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5108422Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5108836Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5109303Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:26:41.5109616Z dist init r=1, world=2 2025-12-04T12:26:41.5109859Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5110259Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5110788Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5111300Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5111850Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5112336Z [rank0]:E1204 12:26:37.194000 449179 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5112815Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5113326Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5113828Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5114335Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5114842Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5115339Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5115872Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5116396Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5117063Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5117684Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5118073Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5118721Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5119250Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5119658Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5120132Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:26:41.5120409Z dist init r=0, world=2 2025-12-04T12:26:41.5120856Z [rank0]:[W1204 12:26:37.066340097 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:26:41.5121314Z FAILED [5.2103s] [100%] 2025-12-04T12:26:41.5121404Z 2025-12-04T12:26:41.5121472Z =================================== FAILURES =================================== 2025-12-04T12:26:41.5121715Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:26:41.5121920Z Traceback (most recent call last): 2025-12-04T12:26:41.5122197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:26:41.5122504Z self._join_processes(fn) 2025-12-04T12:26:41.5122788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:26:41.5123101Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:26:41.5123418Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:26:41.5123724Z raise RuntimeError(error) 2025-12-04T12:26:41.5123941Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5124129Z Traceback (most recent call last): 2025-12-04T12:26:41.5124436Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5124722Z getattr(self, test_name)() 2025-12-04T12:26:41.5124995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5125275Z fn() 2025-12-04T12:26:41.5125516Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5125785Z method(*args, **kwargs) 2025-12-04T12:26:41.5126064Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5126335Z method(*args, **kwargs) 2025-12-04T12:26:41.5126629Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5126902Z with policy(): 2025-12-04T12:26:41.5127158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5127445Z raise RuntimeError(msg) 2025-12-04T12:26:41.5127865Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5128257Z 2025-12-04T12:26:41.5128354Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5128719Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5128966Z 2025-12-04T12:26:41.5129080Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5129229Z 2025-12-04T12:26:41.5129230Z 2025-12-04T12:26:41.5129346Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:26:41.5129608Z Process 1 terminated with exit code 10, terminating remaining processes. 
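Note on the ProcessGroupNCCL warning printed in each run above ("destroy_process_group() was not called before program exit"): when reproducing outside the test harness, the usual pattern is to pair init_process_group with an explicit destroy_process_group. A minimal single-process sketch is below; the backend choice and MASTER_ADDR/MASTER_PORT values are illustrative placeholders, not taken from this job.

# Minimal sketch of pairing init/destroy so the shutdown warning above does not
# fire; the environment values here are placeholders for a local reproduction.
import os
import torch.distributed as dist

def main(rank: int = 0, world_size: int = 1) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    try:
        pass  # test body / collective calls would go here
    finally:
        dist.destroy_process_group()  # explicit shutdown avoids the warning

if __name__ == "__main__":
    main()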
2025-12-04T12:26:41.5130015Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-c6de5f5f6db77275.xml - 2025-12-04T12:26:41.5130415Z =========================== short test summary info ============================ 2025-12-04T12:26:41.5130762Z FAILED [5.2103s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5131084Z Traceback (most recent call last): 2025-12-04T12:26:41.5131396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5131679Z getattr(self, test_name)() 2025-12-04T12:26:41.5131965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5132241Z fn() 2025-12-04T12:26:41.5132479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5132764Z method(*args, **kwargs) 2025-12-04T12:26:41.5133018Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5133278Z method(*args, **kwargs) 2025-12-04T12:26:41.5133551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5133815Z with policy(): 2025-12-04T12:26:41.5134081Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5134358Z raise RuntimeError(msg) 2025-12-04T12:26:41.5134776Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5135151Z 2025-12-04T12:26:41.5135257Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5135604Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5135842Z 2025-12-04T12:26:41.5143185Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5143449Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:26:41.5143696Z ============================== 1 failed in 5.37s =============================== 2025-12-04T12:26:41.5143840Z Got exit code 1 2025-12-04T12:26:41.5144059Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5144390Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:26:41.5144769Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-eecea1a21e81fd93.xml 2025-12-04T12:26:41.5145077Z ============================= test session starts ============================== 2025-12-04T12:26:41.5145293Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5145486Z cachedir: .pytest_cache 2025-12-04T12:26:41.5145718Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5145962Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5146106Z configfile: pytest.ini 2025-12-04T12:26:41.5146338Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5146631Z collecting ... collected 1 item / 1 deselected / 0 selected 2025-12-04T12:26:41.5146798Z stepcurrent: skipping 1 already run items. 2025-12-04T12:26:41.5146932Z Running 0 items in this shard 2025-12-04T12:26:41.5147010Z 2025-12-04T12:26:41.5147265Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-eecea1a21e81fd93.xml - 2025-12-04T12:26:41.5147615Z ============================ 1 deselected in 0.00s ============================= 2025-12-04T12:26:41.5147888Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda'] 2025-12-04T12:26:41.5148102Z 2025-12-04T12:26:41.5148343Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal 1/1 (test/test-reports/distributed.fsdp.test_fsdp_traversal_1.1_ef9ad764013e9636_.log) 2025-12-04T12:26:41.5148579Z 2025-12-04T12:26:41.5148717Z Finished distributed/fsdp/test_fsdp_traversal 1/1 ... [2025-12-04 12:26:41.500951][2290700.150130212], took 0.42min 2025-12-04T12:26:41.5149160Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:26:41.5149549Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:26:41.5149771Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:26:41.5149954Z Uploading artifacts took 0.00 seconds 2025-12-04T12:26:41.5150098Z distributed/fsdp/test_fsdp_traversal 1/1 failed! 2025-12-04T12:26:41.5150306Z Running distributed/test_serialization 1/1 ... [2025-12-04 12:26:41.504565][2290700.15374837] 2025-12-04T12:26:41.5150504Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:26:41.5150906Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_serialization.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:26:41.504754] 2025-12-04T12:26:43.9231003Z 2025-12-04T12:26:43.9231794Z distributed/test_serialization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_serialization_1.1_b8711cdeeb133aaa_.log 2025-12-04T12:26:43.9235145Z Running 11 items in this shard: test/distributed/test_serialization.py::TestSerialization::test_cuda, test/distributed/test_serialization.py::TestSerialization::test_dtensor, test/distributed/test_serialization.py::TestSerialization::test_empty_tensor, test/distributed/test_serialization.py::TestSerialization::test_nested_tensors, test/distributed/test_serialization.py::TestSerialization::test_python_object, test/distributed/test_serialization.py::TestSerialization::test_scalar_tensor, test/distributed/test_serialization.py::TestSerialization::test_str_utf8, test/distributed/test_serialization.py::TestSerialization::test_strided_tensor, test/distributed/test_serialization.py::TestSerialization::test_tensor_with_offset, test/distributed/test_serialization.py::TestSerialization::test_various_data_types, test/distributed/test_serialization.py::TestSerialization::test_weights_only 2025-12-04T12:26:43.9237426Z 2025-12-04T12:26:43.9237661Z Finished distributed/test_serialization 1/1 ... [2025-12-04 12:26:43.922746][2290702.571925858], took 0.04min 2025-12-04T12:26:43.9249395Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:26:43.9276187Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:26:43.9280016Z Running distributed/fsdp/test_fsdp_multiple_wrapping 1/1 ... [2025-12-04 12:26:43.927796][2290702.576973201] 2025-12-04T12:26:43.9281658Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:26:43.9282241Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_wrapping.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:26:43.928082] 2025-12-04T12:27:16.6851764Z 2025-12-04T12:27:16.6856148Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping 1/1 (test/test-reports/distributed.fsdp.test_fsdp_multiple_wrapping_1.1_7d9d262da9a8dffa_.log) 2025-12-04T12:27:16.6856992Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-65c7637dc0619de0.xml 2025-12-04T12:27:16.6857472Z ============================= test session starts ============================== 2025-12-04T12:27:16.6857773Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.6858038Z cachedir: .pytest_cache 2025-12-04T12:27:16.6858418Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.6858755Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.6858922Z configfile: pytest.ini 2025-12-04T12:27:16.6859228Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.6859555Z collecting ... 
collected 1 item 2025-12-04T12:27:16.6859745Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:27:16.6860179Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.6860478Z 2025-12-04T12:27:16.6860904Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 12:26:45.680000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449545 2025-12-04T12:27:16.6861610Z I1204 12:26:45.681000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449546 2025-12-04T12:27:16.6862073Z I1204 12:26:45.682000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 449547 2025-12-04T12:27:16.6862532Z I1204 12:26:45.682000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 449548 2025-12-04T12:27:16.6865419Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6866222Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6867021Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6867657Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6868338Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6869016Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6869640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:27:16.6870331Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6870592Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6870971Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6871506Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6872034Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6872557Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6873042Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6873524Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6874032Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6874535Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6875036Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6875535Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6876081Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6876572Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6877058Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6877712Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 
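[editor's note] The UserWarning repeated above for every rank points at the fix itself: give FSDP an indexed device instead of the bare string "cuda". The following is a minimal, hypothetical sketch (not the test's actual code; it assumes the process group is already initialized and that `rank` is the local GPU index) showing the two options the warning names:

```python
# Hedged sketch, not test_fsdp_multiple_wrapping itself: silence the
# "`device_id` cuda ... does not have an explicit index" warning by making
# the per-rank device explicit before FSDP initialization.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def init_fsdp_model(rank: int) -> FSDP:
    # Assumes torch.distributed.init_process_group(...) has already run.
    torch.cuda.set_device(rank)                 # option 1: set the current device explicitly
    model = nn.Linear(8, 8).cuda(rank)
    # option 2: pass an indexed device rather than the bare "cuda" string
    return FSDP(model, device_id=torch.device("cuda", rank))
```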
2025-12-04T12:27:16.6878374Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6878733Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6879356Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6879890Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6880261Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6880680Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:27:16.6880927Z dist init r=3, world=4 2025-12-04T12:27:16.6881138Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6881479Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6881971Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6882452Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6882932Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6883387Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6883902Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6884369Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6884833Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6885298Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6885804Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6886260Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:27:16.6886717Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6887183Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6887832Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.6888488Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6888867Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6889462Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6889968Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6890337Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6890752Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:27:16.6890998Z dist init r=1, world=4 2025-12-04T12:27:16.6891204Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6891542Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6892029Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6892513Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6892994Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6893443Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6893883Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6894351Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:27:16.6894857Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6895323Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6895788Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6896242Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6896696Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6897167Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6897812Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496. 2025-12-04T12:27:16.6898513Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6898866Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6899482Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6899988Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6900354Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6900771Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:27:16.6901112Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6901447Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6901936Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6902418Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6902896Z [rank2]:E1204 12:26:52.031000 449547 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6903344Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6903790Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6904294Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6904767Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6905237Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6905705Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6906161Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6906627Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6907110Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6907778Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T12:27:16.6908423Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6908777Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6909371Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6909880Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6910248Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6910668Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:27:16.6910916Z dist init r=0, world=4 2025-12-04T12:27:16.6911024Z dist init r=2, world=4 2025-12-04T12:27:16.6911130Z FAILED [7.4135s] [100%] 2025-12-04T12:27:16.6911201Z 2025-12-04T12:27:16.6911263Z =================================== FAILURES =================================== 2025-12-04T12:27:16.6911463Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________ 2025-12-04T12:27:16.6911650Z Traceback (most recent call last): 2025-12-04T12:27:16.6911904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:27:16.6912156Z self._join_processes(fn) 2025-12-04T12:27:16.6912405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:27:16.6912672Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:27:16.6912939Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:27:16.6913201Z raise RuntimeError(error) 2025-12-04T12:27:16.6913360Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:27:16.6913560Z Traceback (most recent call last): 2025-12-04T12:27:16.6913807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6914050Z getattr(self, test_name)() 2025-12-04T12:27:16.6914289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6914521Z fn() 2025-12-04T12:27:16.6914727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6914958Z method(*args, **kwargs) 2025-12-04T12:27:16.6915182Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6915412Z method(*args, **kwargs) 2025-12-04T12:27:16.6915631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6915858Z with policy(): 2025-12-04T12:27:16.6916070Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6916326Z raise RuntimeError(msg) 2025-12-04T12:27:16.6916739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T12:27:16.6917102Z 2025-12-04T12:27:16.6917177Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6917515Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6917781Z 2025-12-04T12:27:16.6917872Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6918003Z 2025-12-04T12:27:16.6918005Z 2025-12-04T12:27:16.6918087Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:27:16.6918328Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:27:16.6918733Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-65c7637dc0619de0.xml - 2025-12-04T12:27:16.6919101Z =========================== short test summary info ============================ 2025-12-04T12:27:16.6919448Z FAILED [7.4135s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:27:16.6919775Z Traceback (most recent call last): 2025-12-04T12:27:16.6920023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6920267Z getattr(self, test_name)() 2025-12-04T12:27:16.6920502Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6920739Z fn() 2025-12-04T12:27:16.6920943Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6921173Z method(*args, **kwargs) 2025-12-04T12:27:16.6921396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6921625Z method(*args, **kwargs) 2025-12-04T12:27:16.6921843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6922070Z with policy(): 2025-12-04T12:27:16.6922320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6922553Z raise RuntimeError(msg) 2025-12-04T12:27:16.6922954Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 
2025-12-04T12:27:16.6923315Z 2025-12-04T12:27:16.6923392Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6923729Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6923988Z 2025-12-04T12:27:16.6924078Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6924267Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:27:16.6924429Z ============================== 1 failed in 7.42s =============================== 2025-12-04T12:27:16.6924563Z Got exit code 1 2025-12-04T12:27:16.6924678Z Retrying single test... 2025-12-04T12:27:16.6924971Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-95628f74af187d69.xml 2025-12-04T12:27:16.6925304Z ============================= test session starts ============================== 2025-12-04T12:27:16.6925516Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.6925707Z cachedir: .pytest_cache 2025-12-04T12:27:16.6925932Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.6926170Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.6926290Z configfile: pytest.ini 2025-12-04T12:27:16.6926522Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.6926767Z collecting ... collected 1 item 2025-12-04T12:27:16.6927061Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.6927360Z Running 1 items in this shard 2025-12-04T12:27:16.6927434Z 2025-12-04T12:27:16.6927740Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 12:26:55.644000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449939 2025-12-04T12:27:16.6928272Z I1204 12:26:55.645000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449940 2025-12-04T12:27:16.6928615Z I1204 12:26:55.645000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 449941 2025-12-04T12:27:16.6928956Z I1204 12:26:55.646000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 449942 2025-12-04T12:27:16.6929649Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6930236Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6930860Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6931443Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6932024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6932605Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6933184Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6933761Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6934044Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6934386Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6934891Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6935376Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6935857Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6936306Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6936750Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6937217Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6937681Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6938188Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6938655Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6939109Z [rank0]:E1204 12:27:01.891000 449939 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6939565Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6940031Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6940708Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2464153600 and is now 3456106496. 2025-12-04T12:27:16.6941314Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6941665Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6942257Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6942761Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6943134Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6943572Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:27:16.6943832Z dist init r=0, world=4 2025-12-04T12:27:16.6944037Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6944376Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6944867Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6945345Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6945823Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6946271Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6946712Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6947176Z [rank1]:E1204 12:27:01.901000 449940 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6947644Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6948108Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6948617Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6949068Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6949530Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6950031Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6950682Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.6951284Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6951635Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6952226Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6952743Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6953122Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6953537Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:27:16.6953778Z dist init r=1, world=4 2025-12-04T12:27:16.6953983Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6954320Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6954809Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6955290Z [rank2]:E1204 12:27:01.944000 449941 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6955772Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6956218Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6956660Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6957125Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6957590Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6958058Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6958596Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6959046Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6959543Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6960012Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6960656Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T12:27:16.6961258Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6961610Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6962209Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6962725Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6963092Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6963507Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:27:16.6963849Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6964185Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6964672Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6965149Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6965624Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6966072Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6966511Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6966976Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6967436Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6967897Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6968429Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6968884Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6969337Z [rank3]:E1204 12:27:01.944000 449942 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6969800Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6970442Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T12:27:16.6971042Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6971404Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6972004Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6972505Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6972870Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6973282Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:27:16.6973524Z dist init r=2, world=4 2025-12-04T12:27:16.6973625Z dist init r=3, world=4 2025-12-04T12:27:16.6973724Z FAILED [7.3131s] [100%] 2025-12-04T12:27:16.6973791Z 2025-12-04T12:27:16.6973851Z =================================== FAILURES =================================== 2025-12-04T12:27:16.6974042Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________ 2025-12-04T12:27:16.6974220Z Traceback (most recent call last): 2025-12-04T12:27:16.6974463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:27:16.6974705Z self._join_processes(fn) 2025-12-04T12:27:16.6974952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:27:16.6975216Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:27:16.6975485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:27:16.6975747Z raise RuntimeError(error) 2025-12-04T12:27:16.6975901Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:27:16.6976063Z Traceback (most recent call last): 2025-12-04T12:27:16.6976303Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6976546Z getattr(self, test_name)() 2025-12-04T12:27:16.6976780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:27:16.6977011Z fn() 2025-12-04T12:27:16.6977212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6977470Z method(*args, **kwargs) 2025-12-04T12:27:16.6977691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6977921Z method(*args, **kwargs) 2025-12-04T12:27:16.6978138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6978419Z with policy(): 2025-12-04T12:27:16.6978631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6978860Z raise RuntimeError(msg) 2025-12-04T12:27:16.6979257Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2464153600 and is now 3456106496. 2025-12-04T12:27:16.6979619Z 2025-12-04T12:27:16.6979697Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6980033Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6980328Z 2025-12-04T12:27:16.6980416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6980542Z 2025-12-04T12:27:16.6980544Z 2025-12-04T12:27:16.6980621Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:27:16.6980823Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:27:16.6981219Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-95628f74af187d69.xml - 2025-12-04T12:27:16.6981587Z =========================== short test summary info ============================ 2025-12-04T12:27:16.6981939Z FAILED [7.3131s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:27:16.6982263Z Traceback (most recent call last): 2025-12-04T12:27:16.6982507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6982751Z getattr(self, test_name)() 2025-12-04T12:27:16.6982980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6983213Z fn() 2025-12-04T12:27:16.6983412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6983639Z method(*args, **kwargs) 2025-12-04T12:27:16.6983857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6984087Z method(*args, **kwargs) 2025-12-04T12:27:16.6984304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6984529Z with policy(): 2025-12-04T12:27:16.6984740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6984972Z raise RuntimeError(msg) 2025-12-04T12:27:16.6985369Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2464153600 and is now 3456106496. 2025-12-04T12:27:16.6985731Z 2025-12-04T12:27:16.6985808Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6986180Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6986440Z 2025-12-04T12:27:16.6986532Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6986720Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:27:16.6986877Z ============================== 1 failed in 7.32s =============================== 2025-12-04T12:27:16.6987009Z Got exit code 1 2025-12-04T12:27:16.6987108Z Retrying single test... 
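[editor's note] Both attempts above fail the same way: the mem_leak_check harness observes the caching allocator's allocated bytes grow (512 -> 1024) and the driver-allocated memory grow across the test, then each rank exits with code 10, which the parent process reports as the test failure. The sketch below is a rough, hypothetical approximation of that before/after comparison using public torch.cuda APIs; it is not the implementation in torch.testing._internal.

```python
# Hedged sketch of a before/after CUDA memory check of the kind reported above,
# not PyTorch's actual PYTORCH_TEST_CUDA_MEM_LEAK_CHECK logic.
import torch

def check_for_leak(fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before                   # bytes handed out by the driver

    fn()                                                   # the test body under scrutiny

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after}"
        )
```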
2025-12-04T12:27:16.6987402Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-ce0028572e67d7a8.xml 2025-12-04T12:27:16.6987811Z ============================= test session starts ============================== 2025-12-04T12:27:16.6988021Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.6988247Z cachedir: .pytest_cache 2025-12-04T12:27:16.6988474Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.6988734Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.6988853Z configfile: pytest.ini 2025-12-04T12:27:16.6989094Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.6989335Z collecting ... collected 1 item 2025-12-04T12:27:16.6989628Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.6989923Z Running 1 items in this shard 2025-12-04T12:27:16.6989995Z 2025-12-04T12:27:16.6990301Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 12:27:05.605000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 450333 2025-12-04T12:27:16.6990801Z I1204 12:27:05.606000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 450334 2025-12-04T12:27:16.6991147Z I1204 12:27:05.606000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 450335 2025-12-04T12:27:16.6991488Z I1204 12:27:05.607000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 450336 2025-12-04T12:27:16.6992171Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6992753Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6993336Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6993917Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6994497Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:27:16.6995073Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6995684Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6996263Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6996501Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6996842Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6997330Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6997811Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6998354Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6998833Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6999272Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6999737Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7000205Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7000669Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7001134Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7001587Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.7002041Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7002506Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7003191Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.7003835Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7004188Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7004803Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7005307Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7005671Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7006083Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:27:16.7006324Z dist init r=1, world=4 2025-12-04T12:27:16.7006526Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.7006864Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.7007351Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7007856Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.7008360Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7008805Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.7009248Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7009711Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7010178Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7010639Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7011101Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7011554Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.7012010Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7012479Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7013124Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T12:27:16.7013730Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7014108Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7014696Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7015200Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7015563Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7015976Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:27:16.7016218Z dist init r=3, world=4 2025-12-04T12:27:16.7016422Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.7016774Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.7017272Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7017750Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.7018259Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7018710Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.7019149Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7019611Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7020074Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7020535Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7021004Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7021455Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.7021908Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7022372Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7023039Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T12:27:16.7023643Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7023993Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7024576Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7025075Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7025442Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7025866Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:27:16.7026127Z dist init r=2, world=4 2025-12-04T12:27:16.7026328Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.7026662Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.7027147Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7027628Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.7028109Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7028592Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.7029030Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7029494Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7029958Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7030419Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7030879Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7031328Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:27:16.7031779Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7032272Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7032912Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496. 2025-12-04T12:27:16.7033517Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7033864Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7034449Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7034949Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7035331Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7035756Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:27:16.7035996Z dist init r=0, world=4 2025-12-04T12:27:16.7036096Z FAILED [7.5129s] [100%] 2025-12-04T12:27:16.7036160Z 2025-12-04T12:27:16.7036221Z =================================== FAILURES =================================== 2025-12-04T12:27:16.7036413Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________ 2025-12-04T12:27:16.7036594Z Traceback (most recent call last): 2025-12-04T12:27:16.7036839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:27:16.7037083Z self._join_processes(fn) 2025-12-04T12:27:16.7037328Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:27:16.7037593Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:27:16.7037859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:27:16.7038117Z raise RuntimeError(error) 2025-12-04T12:27:16.7038301Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:27:16.7038462Z Traceback (most recent call last): 2025-12-04T12:27:16.7038701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7038942Z getattr(self, test_name)() 2025-12-04T12:27:16.7039174Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, 
in wrapper 2025-12-04T12:27:16.7039405Z fn() 2025-12-04T12:27:16.7039607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7039838Z method(*args, **kwargs) 2025-12-04T12:27:16.7040058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7040288Z method(*args, **kwargs) 2025-12-04T12:27:16.7040506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7040732Z with policy(): 2025-12-04T12:27:16.7040945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7041175Z raise RuntimeError(msg) 2025-12-04T12:27:16.7041606Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.7041972Z 2025-12-04T12:27:16.7042047Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7042381Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7042641Z 2025-12-04T12:27:16.7042730Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7042855Z 2025-12-04T12:27:16.7042857Z 2025-12-04T12:27:16.7042934Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:27:16.7043133Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:27:16.7043530Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-ce0028572e67d7a8.xml - 2025-12-04T12:27:16.7043910Z =========================== short test summary info ============================ 2025-12-04T12:27:16.7044269Z FAILED [7.5129s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:27:16.7044594Z Traceback (most recent call last): 2025-12-04T12:27:16.7044839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7045081Z getattr(self, test_name)() 2025-12-04T12:27:16.7045312Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7045544Z fn() 2025-12-04T12:27:16.7045744Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7045975Z method(*args, **kwargs) 2025-12-04T12:27:16.7046193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7046422Z method(*args, **kwargs) 2025-12-04T12:27:16.7046639Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7046862Z with policy(): 2025-12-04T12:27:16.7047073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7047302Z raise RuntimeError(msg) 2025-12-04T12:27:16.7047702Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.7048064Z 2025-12-04T12:27:16.7048141Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7048517Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7048776Z 2025-12-04T12:27:16.7048865Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7049053Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
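The repro command printed in the failure summary above can also be driven from Python. A small convenience sketch, assuming it is run from the base repo dir; the subprocess/os usage is illustrative only and equivalent to the printed shell command:

    import os
    import subprocess

    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    subprocess.run(
        ["python", "test/distributed/fsdp/test_fsdp_multiple_wrapping.py",
         "TestMultipleWrappingCUDA.test_multiple_wrapping_cuda"],
        env=env,
        check=True,
    )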
2025-12-04T12:27:16.7049211Z ============================== 1 failed in 7.52s =============================== 2025-12-04T12:27:16.7049347Z Got exit code 1 2025-12-04T12:27:16.7049581Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.7049951Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:27:16.7050341Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-e6f42f0989388e92.xml 2025-12-04T12:27:16.7050660Z ============================= test session starts ============================== 2025-12-04T12:27:16.7050868Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.7051054Z cachedir: .pytest_cache 2025-12-04T12:27:16.7051278Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.7051515Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.7051633Z configfile: pytest.ini 2025-12-04T12:27:16.7051863Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.7052129Z collecting ... collected 1 item / 1 deselected / 0 selected 2025-12-04T12:27:16.7052305Z stepcurrent: skipping 1 already run items. 2025-12-04T12:27:16.7052436Z Running 0 items in this shard 2025-12-04T12:27:16.7052508Z 2025-12-04T12:27:16.7052797Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-e6f42f0989388e92.xml - 2025-12-04T12:27:16.7053160Z ============================ 1 deselected in 0.00s ============================= 2025-12-04T12:27:16.7053461Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda'] 2025-12-04T12:27:16.7053699Z 2025-12-04T12:27:16.7053921Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping 1/1 (test/test-reports/distributed.fsdp.test_fsdp_multiple_wrapping_1.1_7d9d262da9a8dffa_.log) 2025-12-04T12:27:16.7054178Z 2025-12-04T12:27:16.7054326Z Finished distributed/fsdp/test_fsdp_multiple_wrapping 1/1 ... [2025-12-04 12:27:16.685328][2290735.334506378], took 0.55min 2025-12-04T12:27:16.7054764Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:27:16.7055156Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:27:16.7055373Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:27:16.7055552Z Uploading artifacts took 0.00 seconds 2025-12-04T12:27:16.7055707Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 failed! 2025-12-04T12:27:16.7055933Z Running distributed/fsdp/test_fsdp_ignored_modules 1/1 ... 
[2025-12-04 12:27:16.689038][2290735.338222305] 2025-12-04T12:27:16.7056140Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:27:16.7056552Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_ignored_modules.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:27:16.689216] 2025-12-04T12:28:09.7319075Z 2025-12-04T12:28:09.7319953Z distributed/fsdp/test_fsdp_ignored_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_ignored_modules_1.1_7975e69e7f9e6ae8_.log 2025-12-04T12:28:09.7323529Z Running 8 items in this shard: test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_diff_ignored_modules_across_ranks, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_invalid, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_nested, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_not_under_wrapped_root_ignore_modules_False, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_not_under_wrapped_root_ignore_modules_True, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_transformer, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_states_auto_wrap, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_states_check 2025-12-04T12:28:09.7325643Z 2025-12-04T12:28:09.7325850Z Finished distributed/fsdp/test_fsdp_ignored_modules 1/1 ... [2025-12-04 12:28:09.731602][2290788.380781929], took 0.88min 2025-12-04T12:28:09.7333913Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:28:09.7349231Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:28:09.7352421Z Running distributed/fsdp/test_checkpoint_wrapper 1/1 ... [2025-12-04 12:28:09.735163][2290788.384347319] 2025-12-04T12:28:09.7352766Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:28:09.7354333Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_checkpoint_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:28:09.735344] 2025-12-04T12:28:13.1548096Z 2025-12-04T12:28:13.1553063Z distributed/fsdp/test_checkpoint_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_checkpoint_wrapper_1.1_d80cb57983854b35_.log 2025-12-04T12:28:13.1556035Z Running 8 items in this shard: test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_apply_activation_checkpointing, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_args_kwargs, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_cpu_offload, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_kwarg_support, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_parity, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_forward_missing_attributes, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_fqn, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_load_activation_checkpointed_module 2025-12-04T12:28:13.1558118Z 2025-12-04T12:28:13.1558400Z Finished distributed/fsdp/test_checkpoint_wrapper 1/1 ... [2025-12-04 12:28:13.154469][2290791.803650169], took 0.06min 2025-12-04T12:28:13.1563452Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:28:13.1578765Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:28:13.1581758Z Running distributed/fsdp/test_fsdp_checkpoint 1/1 ... [2025-12-04 12:28:13.158043][2290791.807226429] 2025-12-04T12:28:13.1582021Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:28:13.1583367Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:28:13.158211] 2025-12-04T12:31:08.7923162Z 2025-12-04T12:31:08.7923922Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint 1/1 (test/test-reports/distributed.fsdp.test_fsdp_checkpoint_1.1_18dc4e01a7029ded_.log) 2025-12-04T12:31:08.7924705Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-23ed22c1e35acd9c.xml 2025-12-04T12:31:08.7926051Z ============================= test session starts ============================== 2025-12-04T12:31:08.7926421Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.7926732Z cachedir: .pytest_cache 2025-12-04T12:31:08.7927105Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.7927507Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.7927694Z configfile: pytest.ini 2025-12-04T12:31:08.7928108Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.7929275Z collecting ... 
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.7930007Z class TestModel(nn.Module): 2025-12-04T12:31:08.7930184Z collected 17 items 2025-12-04T12:31:08.7930373Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:31:08.7937267Z Running 17 items in this shard: test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.7943964Z 2025-12-04T12:31:08.7944654Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_False I1204 12:28:14.870000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 452374 2025-12-04T12:31:08.7945598Z I1204 12:28:14.870000 
452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 452375 2025-12-04T12:31:08.7946172Z I1204 12:28:14.871000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 452376 2025-12-04T12:31:08.7946739Z I1204 12:28:14.872000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 452377 2025-12-04T12:31:08.7947550Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7948200Z return func(*args, **kwargs) 2025-12-04T12:31:08.7948377Z dist init r=0, world=4 2025-12-04T12:31:08.7948535Z dist init r=3, world=4 2025-12-04T12:31:08.7948689Z dist init r=2, world=4 2025-12-04T12:31:08.7948874Z dist init r=1, world=4 2025-12-04T12:31:08.7949028Z PASSED [8.4128s] [ 5%] 2025-12-04T12:31:08.7949747Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_True I1204 12:28:23.286000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 452707 2025-12-04T12:31:08.7950734Z I1204 12:28:23.287000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 452708 2025-12-04T12:31:08.7951306Z I1204 12:28:23.287000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 452709 2025-12-04T12:31:08.7951875Z I1204 12:28:23.288000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 452710 2025-12-04T12:31:08.7952682Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7953302Z return func(*args, **kwargs) 2025-12-04T12:31:08.7953475Z dist init r=0, world=4 2025-12-04T12:31:08.7953626Z dist init r=3, world=4 2025-12-04T12:31:08.7953776Z dist init r=1, world=4 2025-12-04T12:31:08.7953926Z dist init r=2, world=4 2025-12-04T12:31:08.7954080Z PASSED [8.2110s] [ 11%] 2025-12-04T12:31:08.7954798Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_False I1204 12:28:31.499000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453040 2025-12-04T12:31:08.7955736Z I1204 12:28:31.499000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453041 2025-12-04T12:31:08.7956308Z I1204 12:28:31.500000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453042 2025-12-04T12:31:08.7956875Z I1204 12:28:31.500000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453043 2025-12-04T12:31:08.7957672Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
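The barrier() UserWarning above suggests binding the process group to an explicit device. A minimal sketch of that initialization, assuming env:// rendezvous (MASTER_ADDR/MASTER_PORT already set) and RANK/WORLD_SIZE environment variables; this is illustrative and not the test harness's own init path:

    import os
    import torch
    import torch.distributed as dist

    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    torch.cuda.set_device(rank)
    dist.init_process_group(
        backend="nccl",                        # RCCL on ROCm still uses the "nccl" backend name
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # silences the barrier() device warning
    )
    dist.barrier()
    dist.destroy_process_group()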
2025-12-04T12:31:08.7958310Z return func(*args, **kwargs) 2025-12-04T12:31:08.7958482Z dist init r=0, world=4 2025-12-04T12:31:08.7958635Z dist init r=3, world=4 2025-12-04T12:31:08.7958786Z dist init r=1, world=4 2025-12-04T12:31:08.7958935Z dist init r=2, world=4 2025-12-04T12:31:08.7959086Z PASSED [8.3114s] [ 17%] 2025-12-04T12:31:08.7959852Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_True I1204 12:28:39.812000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453373 2025-12-04T12:31:08.7960778Z I1204 12:28:39.812000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453374 2025-12-04T12:31:08.7961347Z I1204 12:28:39.813000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453375 2025-12-04T12:31:08.7961913Z I1204 12:28:39.813000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453376 2025-12-04T12:31:08.7962712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7963320Z return func(*args, **kwargs) 2025-12-04T12:31:08.7963491Z dist init r=0, world=4 2025-12-04T12:31:08.7963646Z dist init r=3, world=4 2025-12-04T12:31:08.7963797Z dist init r=1, world=4 2025-12-04T12:31:08.7963976Z dist init r=2, world=4 2025-12-04T12:31:08.7964127Z PASSED [8.2111s] [ 23%] 2025-12-04T12:31:08.7964843Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_False I1204 12:28:48.024000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453706 2025-12-04T12:31:08.7965793Z I1204 12:28:48.025000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453707 2025-12-04T12:31:08.7966358Z I1204 12:28:48.025000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453708 2025-12-04T12:31:08.7966923Z I1204 12:28:48.026000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453709 2025-12-04T12:31:08.7967722Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7968380Z return func(*args, **kwargs) 2025-12-04T12:31:08.7968551Z dist init r=0, world=4 2025-12-04T12:31:08.7968705Z dist init r=3, world=4 2025-12-04T12:31:08.7968857Z dist init r=1, world=4 2025-12-04T12:31:08.7969008Z dist init r=2, world=4 2025-12-04T12:31:08.7969159Z PASSED [8.6112s] [ 29%] 2025-12-04T12:31:08.7969876Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_True I1204 12:28:56.637000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454039 2025-12-04T12:31:08.7970811Z I1204 12:28:56.637000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454040 2025-12-04T12:31:08.7971381Z I1204 12:28:56.638000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454041 2025-12-04T12:31:08.7971951Z I1204 12:28:56.639000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454042 2025-12-04T12:31:08.7972753Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7973359Z return func(*args, **kwargs) 2025-12-04T12:31:08.7973530Z dist init r=0, world=4 2025-12-04T12:31:08.7973681Z dist init r=3, world=4 2025-12-04T12:31:08.7973832Z dist init r=1, world=4 2025-12-04T12:31:08.7973982Z dist init r=2, world=4 2025-12-04T12:31:08.7974133Z PASSED [8.2112s] [ 35%] 2025-12-04T12:31:08.7974898Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_False I1204 12:29:04.849000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454372 2025-12-04T12:31:08.7975828Z I1204 12:29:04.850000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454373 2025-12-04T12:31:08.7976399Z I1204 12:29:04.851000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454374 2025-12-04T12:31:08.7976965Z I1204 12:29:04.851000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454375 2025-12-04T12:31:08.7977762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7978419Z return func(*args, **kwargs) 2025-12-04T12:31:08.7978590Z dist init r=0, world=4 2025-12-04T12:31:08.7978745Z dist init r=3, world=4 2025-12-04T12:31:08.7978897Z dist init r=1, world=4 2025-12-04T12:31:08.7979050Z dist init r=2, world=4 2025-12-04T12:31:08.7979221Z PASSED [8.2109s] [ 41%] 2025-12-04T12:31:08.7979933Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_True I1204 12:29:13.062000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454705 2025-12-04T12:31:08.7980884Z I1204 12:29:13.063000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454706 2025-12-04T12:31:08.7981449Z I1204 12:29:13.063000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454707 2025-12-04T12:31:08.7982012Z I1204 12:29:13.064000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454708 2025-12-04T12:31:08.7982811Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7983423Z return func(*args, **kwargs) 2025-12-04T12:31:08.7983594Z dist init r=0, world=4 2025-12-04T12:31:08.7983747Z dist init r=3, world=4 2025-12-04T12:31:08.7983897Z dist init r=2, world=4 2025-12-04T12:31:08.7984048Z dist init r=1, world=4 2025-12-04T12:31:08.7984200Z PASSED [8.3114s] [ 47%] 2025-12-04T12:31:08.7984914Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_False I1204 12:29:21.375000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455038 2025-12-04T12:31:08.7985847Z I1204 12:29:21.375000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455039 2025-12-04T12:31:08.7986415Z I1204 12:29:21.376000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455040 2025-12-04T12:31:08.7986982Z I1204 12:29:21.376000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455041 2025-12-04T12:31:08.7987785Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7988452Z return func(*args, **kwargs) 2025-12-04T12:31:08.7988623Z dist init r=0, world=4 2025-12-04T12:31:08.7988775Z dist init r=3, world=4 2025-12-04T12:31:08.7988924Z dist init r=2, world=4 2025-12-04T12:31:08.7989075Z dist init r=1, world=4 2025-12-04T12:31:08.7989228Z PASSED [8.4115s] [ 52%] 2025-12-04T12:31:08.7989988Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_True I1204 12:29:29.787000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455371 2025-12-04T12:31:08.7990922Z I1204 12:29:29.788000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455372 2025-12-04T12:31:08.7991491Z I1204 12:29:29.789000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455373 2025-12-04T12:31:08.7992055Z I1204 12:29:29.789000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455374 2025-12-04T12:31:08.7992854Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7993467Z return func(*args, **kwargs) 2025-12-04T12:31:08.7993636Z dist init r=0, world=4 2025-12-04T12:31:08.7993789Z dist init r=3, world=4 2025-12-04T12:31:08.7993944Z dist init r=2, world=4 2025-12-04T12:31:08.7994096Z dist init r=1, world=4 2025-12-04T12:31:08.7994291Z PASSED [8.6121s] [ 58%] 2025-12-04T12:31:08.7995003Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_False I1204 12:29:38.401000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455704 2025-12-04T12:31:08.7995950Z I1204 12:29:38.402000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455705 2025-12-04T12:31:08.7996513Z I1204 12:29:38.402000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455706 2025-12-04T12:31:08.7997073Z I1204 12:29:38.403000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455707 2025-12-04T12:31:08.7997865Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7998518Z return func(*args, **kwargs) 2025-12-04T12:31:08.7998687Z dist init r=0, world=4 2025-12-04T12:31:08.7998836Z dist init r=3, world=4 2025-12-04T12:31:08.7998984Z dist init r=1, world=4 2025-12-04T12:31:08.7999131Z dist init r=2, world=4 2025-12-04T12:31:08.7999279Z PASSED [8.2116s] [ 64%] 2025-12-04T12:31:08.7999987Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_True I1204 12:29:46.614000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456037 2025-12-04T12:31:08.8000913Z I1204 12:29:46.615000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456038 2025-12-04T12:31:08.8001477Z I1204 12:29:46.615000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456039 2025-12-04T12:31:08.8002042Z I1204 12:29:46.616000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456040 2025-12-04T12:31:08.8002837Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.8003443Z return func(*args, **kwargs) 2025-12-04T12:31:08.8003611Z dist init r=0, world=4 2025-12-04T12:31:08.8003761Z dist init r=3, world=4 2025-12-04T12:31:08.8003909Z dist init r=1, world=4 2025-12-04T12:31:08.8004057Z dist init r=2, world=4 2025-12-04T12:31:08.8004206Z PASSED [8.5117s] [ 70%] 2025-12-04T12:31:08.8004965Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_False I1204 12:29:55.127000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456370 2025-12-04T12:31:08.8005895Z I1204 12:29:55.128000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456371 2025-12-04T12:31:08.8006458Z I1204 12:29:55.129000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456372 2025-12-04T12:31:08.8007022Z I1204 12:29:55.129000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456373 2025-12-04T12:31:08.8007816Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.8008451Z return func(*args, **kwargs) 2025-12-04T12:31:08.8008619Z dist init r=3, world=4 2025-12-04T12:31:08.8008770Z dist init r=0, world=4 2025-12-04T12:31:08.8008922Z dist init r=1, world=4 2025-12-04T12:31:08.8009070Z dist init r=2, world=4 2025-12-04T12:31:08.8009219Z PASSED [8.8118s] [ 76%] 2025-12-04T12:31:08.8009960Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_True I1204 12:30:03.941000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456703 2025-12-04T12:31:08.8010902Z I1204 12:30:03.941000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456704 2025-12-04T12:31:08.8011464Z I1204 12:30:03.942000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456705 2025-12-04T12:31:08.8012025Z I1204 12:30:03.943000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456706 2025-12-04T12:31:08.8012819Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.8013425Z return func(*args, **kwargs) 2025-12-04T12:31:08.8013591Z dist init r=0, world=4 2025-12-04T12:31:08.8013740Z dist init r=3, world=4 2025-12-04T12:31:08.8013889Z dist init r=1, world=4 2025-12-04T12:31:08.8014038Z dist init r=2, world=4 2025-12-04T12:31:08.8014187Z PASSED [8.4122s] [ 82%] 2025-12-04T12:31:08.8014895Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_False I1204 12:30:12.354000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457036 2025-12-04T12:31:08.8015821Z I1204 12:30:12.355000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457037 2025-12-04T12:31:08.8016388Z I1204 12:30:12.356000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457038 2025-12-04T12:31:08.8016952Z I1204 12:30:12.356000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457039 2025-12-04T12:31:08.8017748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.8018388Z return func(*args, **kwargs) 2025-12-04T12:31:08.8018555Z dist init r=0, world=4 2025-12-04T12:31:08.8018701Z dist init r=3, world=4 2025-12-04T12:31:08.8018849Z dist init r=1, world=4 2025-12-04T12:31:08.8018997Z dist init r=2, world=4 2025-12-04T12:31:08.8019145Z PASSED [8.5126s] [ 88%] 2025-12-04T12:31:08.8019928Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_True I1204 12:30:20.868000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457369 2025-12-04T12:31:08.8020862Z I1204 12:30:20.869000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457370 2025-12-04T12:31:08.8021426Z I1204 12:30:20.870000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457371 2025-12-04T12:31:08.8021990Z I1204 12:30:20.870000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457372 2025-12-04T12:31:08.8022783Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.8023390Z return func(*args, **kwargs) 2025-12-04T12:31:08.8023558Z dist init r=0, world=4 2025-12-04T12:31:08.8023707Z dist init r=3, world=4 2025-12-04T12:31:08.8023856Z dist init r=2, world=4 2025-12-04T12:31:08.8024009Z dist init r=1, world=4 2025-12-04T12:31:08.8024161Z PASSED [8.3113s] [ 94%] 2025-12-04T12:31:08.8024851Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:30:29.182000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457702 2025-12-04T12:31:08.8025750Z I1204 12:30:29.182000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457703 2025-12-04T12:31:08.8026313Z I1204 12:30:29.183000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457704 2025-12-04T12:31:08.8026873Z I1204 12:30:29.183000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457705 2025-12-04T12:31:08.8027649Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8028333Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8029354Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8030337Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8030947Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T12:31:08.8031587Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8032599Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8033576Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8034179Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8034818Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8035500Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8036150Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8036798Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8037442Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8038086Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8038768Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8039403Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8040046Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8040711Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8041378Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8042021Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8042656Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8043671Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
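Note: the _init_utils.py UserWarning repeated above comes from passing the bare string "cuda" as device_id. Per the warning text, either call torch.cuda.set_device() before constructing FSDP or pass a device with an explicit index. A minimal sketch (the rank lookup and import alias are illustrative):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    rank = dist.get_rank()
    torch.cuda.set_device(rank)                               # option 1: set the current device
    fsdp_kwargs = {"device_id": torch.device("cuda", rank)}   # option 2: indexed device
    # e.g. model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs)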
2025-12-04T12:31:08.8044646Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8045248Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8045881Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8046514Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8047155Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8047801Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8048480Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8049131Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8049768Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8050780Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8051797Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8052398Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8053035Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8053669Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8054312Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8054956Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8055602Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8057994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8060497Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8062935Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8065345Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8067810Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8070252Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8072677Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8075101Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8075596Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8076160Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8076984Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8077792Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8078647Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8079398Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8080140Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8080920Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8081699Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8082480Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8083257Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8084014Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8084813Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8085593Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8086770Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8087876Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8088493Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8089546Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8090487Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8091092Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8091784Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:31:08.8092178Z dist init r=2, world=4 2025-12-04T12:31:08.8092509Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8093067Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8093880Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8094684Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8095482Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8096230Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8096964Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8097743Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8098561Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8099336Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8100156Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8100910Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8101674Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8102455Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8103627Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8104732Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8105327Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8106412Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8107319Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8107926Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8108653Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:31:08.8109050Z dist init r=3, world=4 2025-12-04T12:31:08.8109374Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8109932Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8110747Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8111554Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8112353Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8113099Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8113833Z [rank0]:E1204 12:30:36.277000 457702 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8114606Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8115387Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8116201Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8116976Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8117732Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8118535Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8119312Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8120488Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 
2025-12-04T12:31:08.8121625Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8122200Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8123255Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8124160Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8124763Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8125452Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:31:08.8125845Z dist init r=0, world=4 2025-12-04T12:31:08.8126167Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8126724Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8127539Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8128379Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8129181Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8129928Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8130663Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8131488Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8132264Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8133041Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8133816Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8134569Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:31:08.8135326Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8136120Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8137308Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 2025-12-04T12:31:08.8138453Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8139034Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8140076Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8140980Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8141581Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8142268Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:31:08.8142665Z dist init r=1, world=4 2025-12-04T12:31:08.8143337Z [rank0]:[W1204 12:30:36.112743687 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T12:31:08.8144014Z FAILED [8.8121s] [100%]
2025-12-04T12:31:08.8144112Z 
2025-12-04T12:31:08.8144200Z =================================== FAILURES ===================================
2025-12-04T12:31:08.8144541Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _
2025-12-04T12:31:08.8144867Z Traceback (most recent call last):
2025-12-04T12:31:08.8145260Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T12:31:08.8145656Z self._join_processes(fn)
2025-12-04T12:31:08.8146054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T12:31:08.8146527Z self._check_return_codes(fn, elapsed_time)
2025-12-04T12:31:08.8146962Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T12:31:08.8147388Z raise RuntimeError(error)
2025-12-04T12:31:08.8147624Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T12:31:08.8147879Z Traceback (most recent call last):
2025-12-04T12:31:08.8148304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:31:08.8148696Z getattr(self, test_name)()
2025-12-04T12:31:08.8149070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:31:08.8149450Z fn()
2025-12-04T12:31:08.8149774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:31:08.8150150Z method(*args, **kwargs)
2025-12-04T12:31:08.8150509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:31:08.8150903Z method(*args, **kwargs)
2025-12-04T12:31:08.8151254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:31:08.8151644Z with policy():
2025-12-04T12:31:08.8151985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:31:08.8152362Z raise RuntimeError(msg)
2025-12-04T12:31:08.8153104Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664.
2025-12-04T12:31:08.8153801Z 2025-12-04T12:31:08.8153920Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8154540Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8155045Z 2025-12-04T12:31:08.8155186Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8155388Z 2025-12-04T12:31:08.8155390Z 2025-12-04T12:31:08.8155513Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:31:08.8155833Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:31:08.8156458Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-23ed22c1e35acd9c.xml - 2025-12-04T12:31:08.8157033Z =========================== short test summary info ============================ 2025-12-04T12:31:08.8157669Z FAILED [8.8121s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:31:08.8158308Z Traceback (most recent call last): 2025-12-04T12:31:08.8158705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8159100Z getattr(self, test_name)() 2025-12-04T12:31:08.8159475Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8159854Z fn() 2025-12-04T12:31:08.8160177Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8160551Z method(*args, **kwargs) 2025-12-04T12:31:08.8160947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8161321Z method(*args, **kwargs) 2025-12-04T12:31:08.8161675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8162044Z with policy(): 2025-12-04T12:31:08.8162384Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8162763Z raise RuntimeError(msg) 2025-12-04T12:31:08.8163509Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8164206Z 2025-12-04T12:31:08.8164321Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8164942Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8165500Z 2025-12-04T12:31:08.8165639Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8165961Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
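Note: the RuntimeError above is raised by the mem-leak-check policy, which compares per-device memory before and after the test body; the two figures in the message are the caching-allocator bytes and the driver-level allocation. The sketch below only illustrates that kind of bookkeeping (the real check lives in common_utils.py and is more involved), and it ends with the explicit shutdown the ProcessGroupNCCL warning asks for:

    import torch
    import torch.distributed as dist

    device = torch.cuda.current_device()
    alloc_before = torch.cuda.memory_allocated(device)     # caching-allocator view
    free_before, _total = torch.cuda.mem_get_info(device)  # driver-level view

    # ... run the test body here ...

    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator {alloc_before} -> {alloc_after}"
        )

    # Addresses the separate "destroy_process_group() was not called" warning.
    if dist.is_initialized():
        dist.destroy_process_group()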
2025-12-04T12:31:08.8166223Z =================== 1 failed, 16 passed in 143.13s (0:02:23) =================== 2025-12-04T12:31:08.8166444Z Got exit code 1 2025-12-04T12:31:08.8166590Z Retrying single test... 2025-12-04T12:31:08.8167034Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-b6017cfe350bdc50.xml 2025-12-04T12:31:08.8167531Z ============================= test session starts ============================== 2025-12-04T12:31:08.8167866Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.8168211Z cachedir: .pytest_cache 2025-12-04T12:31:08.8168570Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.8168959Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.8169143Z configfile: pytest.ini 2025-12-04T12:31:08.8169506Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.8170393Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.8171079Z class TestModel(nn.Module): 2025-12-04T12:31:08.8171273Z collected 17 items / 16 deselected / 1 selected 2025-12-04T12:31:08.8171850Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8172404Z Running 1 items in this shard 2025-12-04T12:31:08.8172516Z 2025-12-04T12:31:08.8173094Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:30:40.453000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458104 2025-12-04T12:31:08.8173990Z I1204 12:30:40.454000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458105 2025-12-04T12:31:08.8174558Z I1204 12:30:40.454000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 458106 2025-12-04T12:31:08.8175123Z I1204 12:30:40.455000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 458107 2025-12-04T12:31:08.8175943Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8176586Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8177608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:31:08.8178632Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8179238Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8179881Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8180535Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8181214Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8181862Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8182504Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8183150Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8183790Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8184807Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8185784Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8186387Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8187025Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8187662Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8188333Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8188966Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8189611Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8190634Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:31:08.8191644Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8192245Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8192891Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8193535Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8194170Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8194806Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8195447Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8196092Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8196755Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8197422Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8198057Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8199109Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8200084Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8200688Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8201328Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8201967Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8202614Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8203261Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8203906Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8206324Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8208768Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8211200Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8213637Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8216062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8218509Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8220942Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8223351Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8223845Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8224407Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8225257Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8226061Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8226862Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8227609Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8228396Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8229177Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8229977Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8230766Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8231539Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8232293Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8233053Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:31:08.8233834Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8235014Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 2025-12-04T12:31:08.8236120Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8236701Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8237760Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8238715Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8239319Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8240008Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:31:08.8240404Z dist init r=0, world=4 2025-12-04T12:31:08.8240772Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8241332Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8242148Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8242947Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8243747Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8244498Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8245256Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8246049Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8246824Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8247598Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8248404Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8249163Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8249929Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8250711Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8251888Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8252999Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8253578Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8254628Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8255532Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8256177Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8256865Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:31:08.8257262Z dist init r=3, world=4 2025-12-04T12:31:08.8257586Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8258180Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8258993Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8259796Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8260595Z [rank1]:E1204 12:30:52.423000 458105 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8261380Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8262113Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8262891Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8263671Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8264448Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8265222Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8265978Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8266735Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8267516Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8268717Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 
2025-12-04T12:31:08.8269818Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8270394Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8271484Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8272387Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8272992Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8273684Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:31:08.8274079Z dist init r=1, world=4 2025-12-04T12:31:08.8274400Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8274958Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8275773Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8276610Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8277411Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8278194Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8278928Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8279703Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8280476Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8281251Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8282024Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8282778Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:31:08.8283537Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8284317Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8285490Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8286586Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8287198Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8288277Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8289177Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8289780Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8290468Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:31:08.8290865Z dist init r=2, world=4 2025-12-04T12:31:08.8291523Z [rank0]:[W1204 12:30:52.247419805 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:31:08.8292241Z FAILED [13.7172s] [100%] 2025-12-04T12:31:08.8292345Z 2025-12-04T12:31:08.8292429Z =================================== FAILURES =================================== 2025-12-04T12:31:08.8292768Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _ 2025-12-04T12:31:08.8293094Z Traceback (most recent call last): 2025-12-04T12:31:08.8293485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:31:08.8293880Z self._join_processes(fn) 2025-12-04T12:31:08.8294278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:31:08.8294707Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:31:08.8295147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:31:08.8295574Z raise RuntimeError(error) 2025-12-04T12:31:08.8295810Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:31:08.8296065Z Traceback (most recent call last): 2025-12-04T12:31:08.8296454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8296846Z getattr(self, test_name)() 2025-12-04T12:31:08.8297222Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8297600Z fn() 2025-12-04T12:31:08.8297925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8298337Z method(*args, **kwargs) 2025-12-04T12:31:08.8298698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8299072Z method(*args, **kwargs) 2025-12-04T12:31:08.8299424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8299793Z with policy(): 2025-12-04T12:31:08.8300133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8300515Z raise RuntimeError(msg) 2025-12-04T12:31:08.8301314Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 
2025-12-04T12:31:08.8302011Z 2025-12-04T12:31:08.8302127Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8302745Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8303252Z 2025-12-04T12:31:08.8303391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8303595Z 2025-12-04T12:31:08.8303597Z 2025-12-04T12:31:08.8303718Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:31:08.8304036Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:31:08.8304657Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-b6017cfe350bdc50.xml - 2025-12-04T12:31:08.8305236Z =========================== short test summary info ============================ 2025-12-04T12:31:08.8305883Z FAILED [13.7172s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:31:08.8306501Z Traceback (most recent call last): 2025-12-04T12:31:08.8306892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8307286Z getattr(self, test_name)() 2025-12-04T12:31:08.8307659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8308036Z fn() 2025-12-04T12:31:08.8308399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8308774Z method(*args, **kwargs) 2025-12-04T12:31:08.8309128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8309503Z method(*args, **kwargs) 2025-12-04T12:31:08.8309857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8310227Z with policy(): 2025-12-04T12:31:08.8310566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8310943Z raise RuntimeError(msg) 2025-12-04T12:31:08.8311688Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 2025-12-04T12:31:08.8312384Z 2025-12-04T12:31:08.8312501Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8313122Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8313628Z 2025-12-04T12:31:08.8313765Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8314062Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
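The leak check behind these failures (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1) snapshots allocator memory before the test body and compares it afterwards, which is what the "Caching allocator allocated memory was 512 and is now reported as 3236352" messages reflect. Below is only a rough Python sketch of that kind of before/after comparison; the helper name check_cuda_leak is hypothetical and this is not the actual harness in torch/testing/_internal/common_utils.py.

# Rough sketch of a before/after caching-allocator comparison, in the spirit
# of PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 (hypothetical helper, not the real
# harness in torch/testing/_internal/common_utils.py).
import torch

def check_cuda_leak(test_fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)  # caching-allocator bytes in use
    test_fn()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible leak on device {device}: {before} -> {after} bytes"
        )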
2025-12-04T12:31:08.8314321Z ====================== 1 failed, 16 deselected in 13.73s ======================= 2025-12-04T12:31:08.8314537Z Got exit code 1 2025-12-04T12:31:08.8314684Z Retrying single test... 2025-12-04T12:31:08.8315129Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-ba679004c1dc5cc7.xml 2025-12-04T12:31:08.8315664Z ============================= test session starts ============================== 2025-12-04T12:31:08.8315994Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.8316295Z cachedir: .pytest_cache 2025-12-04T12:31:08.8316651Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.8317040Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.8317221Z configfile: pytest.ini 2025-12-04T12:31:08.8317583Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.8318508Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.8319195Z class TestModel(nn.Module): 2025-12-04T12:31:08.8319392Z collected 17 items / 16 deselected / 1 selected 2025-12-04T12:31:08.8319962Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8320542Z Running 1 items in this shard 2025-12-04T12:31:08.8320676Z 2025-12-04T12:31:08.8321250Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:30:56.748000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458506 2025-12-04T12:31:08.8322134Z I1204 12:30:56.749000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458507 2025-12-04T12:31:08.8322701Z I1204 12:30:56.749000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 458508 2025-12-04T12:31:08.8323268Z I1204 12:30:56.750000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 458509 2025-12-04T12:31:08.8324038Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8324682Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8325702Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
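The FSDP UserWarning above names two remediations: call torch.cuda.set_device() before constructing FSDP, or pass an explicit device index as device_id. A minimal sketch of both follows, assuming one process per GPU and an already-initialized default process group; the rank argument and the Linear module are placeholders.

# Sketch of the remediation named in the warning above (assumes the default
# process group is already initialized and there is one process per GPU;
# the model is a placeholder).
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_on_rank(rank: int) -> FSDP:
    torch.cuda.set_device(rank)            # option 1: pin the current device
    model = nn.Linear(8, 8).cuda(rank)
    return FSDP(model, device_id=rank)     # option 2: explicit device index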
2025-12-04T12:31:08.8326676Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8327282Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8327922Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8328600Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8329247Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8329893Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8330536Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8331222Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8331862Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8332877Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8333849Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8334456Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8335095Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8335737Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8345806Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8346686Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8347345Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8348002Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8348698Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8349727Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8350728Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8351341Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8351982Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8352623Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8353274Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8353924Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8354572Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8355223Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8355863Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8356931Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8357910Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8358542Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8359177Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8359814Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8360458Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8361107Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8361768Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8364132Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8366571Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8369052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8371459Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8373917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8376329Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8378800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8381232Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8381731Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8382297Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8383123Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8383929Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8384734Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8385486Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8386228Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8387013Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8387794Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8388606Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8389384Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8390756Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8391518Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8392307Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8393495Z [rank3]:E1204 12:31:03.912000 458509 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8394608Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8395191Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8396274Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8397181Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8397789Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8398517Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:31:08.8398917Z dist init r=3, world=4 2025-12-04T12:31:08.8399250Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8399812Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8400631Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8401435Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8402243Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8402995Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8403733Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8404512Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8405292Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8406116Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8406893Z 
[rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8407651Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8408453Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8409235Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8410416Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2462056448 and is now 3948937216. 2025-12-04T12:31:08.8411555Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8412132Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8413182Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8414091Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8414697Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8415391Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:31:08.8415791Z dist init r=0, world=4 2025-12-04T12:31:08.8416115Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8416676Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8417494Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8418334Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8419141Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8419892Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:31:08.8420629Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8421407Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8422222Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8422998Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8423776Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8424533Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8425294Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8426076Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8427281Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 
2025-12-04T12:31:08.8428462Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8429041Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8430093Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8430999Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8431604Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8432297Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:31:08.8432692Z dist init r=1, world=4 2025-12-04T12:31:08.8433016Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8433577Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8434394Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8435203Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8436011Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8436765Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8437542Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8438360Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8439141Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8439919Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8440699Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8441460Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:31:08.8442241Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8443039Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8444221Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8445325Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8445904Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8446963Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8447874Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8448511Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8449204Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:31:08.8449599Z dist init r=2, world=4 2025-12-04T12:31:08.8450260Z [rank0]:[W1204 12:31:04.801497155 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:31:08.8450945Z FAILED [8.9146s] [100%] 2025-12-04T12:31:08.8451043Z 2025-12-04T12:31:08.8451134Z =================================== FAILURES =================================== 2025-12-04T12:31:08.8451478Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _ 2025-12-04T12:31:08.8451805Z Traceback (most recent call last): 2025-12-04T12:31:08.8452201Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:31:08.8452599Z self._join_processes(fn) 2025-12-04T12:31:08.8453037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:31:08.8453471Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:31:08.8453909Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:31:08.8454338Z raise RuntimeError(error) 2025-12-04T12:31:08.8454578Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:31:08.8454834Z Traceback (most recent call last): 2025-12-04T12:31:08.8455222Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8455618Z getattr(self, test_name)() 2025-12-04T12:31:08.8455993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8456376Z fn() 2025-12-04T12:31:08.8456700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8457094Z method(*args, **kwargs) 2025-12-04T12:31:08.8457452Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8457844Z method(*args, **kwargs) 2025-12-04T12:31:08.8458237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8458598Z with policy(): 2025-12-04T12:31:08.8458822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8459054Z raise RuntimeError(msg) 2025-12-04T12:31:08.8459505Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 
2025-12-04T12:31:08.8459918Z 2025-12-04T12:31:08.8459995Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8460373Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8460674Z 2025-12-04T12:31:08.8460763Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8460890Z 2025-12-04T12:31:08.8460892Z 2025-12-04T12:31:08.8460972Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:31:08.8461176Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:31:08.8461564Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-ba679004c1dc5cc7.xml - 2025-12-04T12:31:08.8461913Z =========================== short test summary info ============================ 2025-12-04T12:31:08.8462295Z FAILED [8.9146s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:31:08.8462653Z Traceback (most recent call last): 2025-12-04T12:31:08.8462902Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8463147Z getattr(self, test_name)() 2025-12-04T12:31:08.8463384Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8463620Z fn() 2025-12-04T12:31:08.8463870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8464100Z method(*args, **kwargs) 2025-12-04T12:31:08.8464321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8464557Z method(*args, **kwargs) 2025-12-04T12:31:08.8464781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8465013Z with policy(): 2025-12-04T12:31:08.8465234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8465471Z raise RuntimeError(msg) 2025-12-04T12:31:08.8465930Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8466348Z 2025-12-04T12:31:08.8466426Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8466827Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8467146Z 2025-12-04T12:31:08.8467236Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8467429Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
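The ProcessGroupNCCL warning repeated in this log notes that destroy_process_group() was never called before program exit. A single-process sketch of the init/teardown pairing it asks for; the MASTER_ADDR/MASTER_PORT values are placeholders.

# Minimal single-process sketch of pairing init_process_group with
# destroy_process_group, as the ProcessGroupNCCL warning above requests
# (rendezvous values are placeholders).
import os
import torch.distributed as dist

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("nccl", rank=0, world_size=1)
try:
    pass  # test body / training step would run here
finally:
    dist.destroy_process_group()  # avoids the resource-leak warning at exit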
2025-12-04T12:31:08.8467602Z ======================= 1 failed, 16 deselected in 8.93s ======================= 2025-12-04T12:31:08.8467748Z Got exit code 1 2025-12-04T12:31:08.8468023Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8468447Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:31:08.8468836Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-8a208df4594f8f27.xml 2025-12-04T12:31:08.8469152Z ============================= test session starts ============================== 2025-12-04T12:31:08.8469373Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.8469568Z cachedir: .pytest_cache 2025-12-04T12:31:08.8469801Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.8470048Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.8470175Z configfile: pytest.ini 2025-12-04T12:31:08.8470410Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.8470955Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.8471371Z class TestModel(nn.Module): 2025-12-04T12:31:08.8471501Z collected 17 items / 17 deselected / 0 selected 2025-12-04T12:31:08.8471646Z stepcurrent: skipping 17 already run items. 2025-12-04T12:31:08.8471777Z Running 0 items in this shard 2025-12-04T12:31:08.8471850Z 2025-12-04T12:31:08.8472112Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-8a208df4594f8f27.xml - 2025-12-04T12:31:08.8472463Z ============================ 17 deselected in 0.01s ============================ 2025-12-04T12:31:08.8472834Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda'] 2025-12-04T12:31:08.8473109Z 2025-12-04T12:31:08.8473318Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint 1/1 (test/test-reports/distributed.fsdp.test_fsdp_checkpoint_1.1_18dc4e01a7029ded_.log) 2025-12-04T12:31:08.8473557Z 2025-12-04T12:31:08.8473697Z Finished distributed/fsdp/test_fsdp_checkpoint 1/1 ... [2025-12-04 12:31:08.792587][2290967.441765475], took 2.93min 2025-12-04T12:31:08.8474145Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:31:08.8474532Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:31:08.8474753Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:31:08.8474934Z Uploading artifacts took 0.00 seconds 2025-12-04T12:31:08.8475077Z distributed/fsdp/test_fsdp_checkpoint 1/1 failed! 2025-12-04T12:31:08.8475287Z Running distributed/fsdp/test_fsdp_fine_tune 1/1 ... 
[2025-12-04 12:31:08.796400][2290967.445583852] 2025-12-04T12:31:08.8475503Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:31:08.8475906Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_fine_tune.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:31:08.796573] 2025-12-04T12:33:30.8886288Z 2025-12-04T12:33:30.8889587Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_fine_tune 1/1 (test/test-reports/distributed.fsdp.test_fsdp_fine_tune_1.1_f2107156872849a9_.log) 2025-12-04T12:33:30.8892005Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d1b74c890111edb9.xml 2025-12-04T12:33:30.8892381Z ============================= test session starts ============================== 2025-12-04T12:33:30.8892651Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.8892855Z cachedir: .pytest_cache 2025-12-04T12:33:30.8893085Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.8893337Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.8893456Z configfile: pytest.ini 2025-12-04T12:33:30.8893690Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.8893937Z collecting ... collected 4 items 2025-12-04T12:33:30.8894082Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:33:30.8894805Z Running 4 items in this shard: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.8895474Z 2025-12-04T12:33:30.8895773Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:31:10.491000 458907 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458976 2025-12-04T12:33:30.8896259Z I1204 12:31:10.491000 458907 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458977 2025-12-04T12:33:30.8897559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8898233Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8898823Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:33:30.8899407Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8899798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.8900168Z return func(*args, **kwargs) 2025-12-04T12:33:30.8900537Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8900964Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8901335Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8901761Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8902116Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8902456Z seq = FSDP( 2025-12-04T12:33:30.8902776Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8903114Z seq = FSDP( 2025-12-04T12:33:30.8904451Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.8905894Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8907368Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.8908861Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8909173Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8909519Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8910020Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8910505Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8910985Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8911473Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8911916Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8912384Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8912857Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8913323Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8913787Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8914242Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8914698Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8915166Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8915818Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 
2025-12-04T12:33:30.8916430Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8916784Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8917391Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8917883Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8918311Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8918729Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.8918972Z dist init r=1, world=2 2025-12-04T12:33:30.8919181Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8919521Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8920010Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8920505Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8921004Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8921456Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8921898Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8922365Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8922833Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8923301Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8923764Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8924218Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8924675Z 
[rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8925141Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8925797Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 2025-12-04T12:33:30.8926405Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8926793Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8927364Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8927852Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8928260Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8928675Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.8928917Z dist init r=0, world=2 2025-12-04T12:33:30.8929337Z [rank0]:[W1204 12:31:17.642245558 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.8929786Z FAILED [8.9124s] [ 25%] 2025-12-04T12:33:30.8929852Z 2025-12-04T12:33:30.8929935Z =================================== FAILURES =================================== 2025-12-04T12:33:30.8930128Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:33:30.8930303Z Traceback (most recent call last): 2025-12-04T12:33:30.8930550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.8930794Z self._join_processes(fn) 2025-12-04T12:33:30.8931042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.8931307Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.8931577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.8931838Z raise RuntimeError(error) 2025-12-04T12:33:30.8931991Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.8932155Z Traceback (most recent call last): 2025-12-04T12:33:30.8932396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8932638Z getattr(self, test_name)() 2025-12-04T12:33:30.8933122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8933355Z fn() 2025-12-04T12:33:30.8933560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8933792Z method(*args, **kwargs) 2025-12-04T12:33:30.8934017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8934248Z method(*args, **kwargs) 2025-12-04T12:33:30.8934468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8934696Z with policy(): 2025-12-04T12:33:30.8934912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8935142Z raise RuntimeError(msg) 2025-12-04T12:33:30.8935540Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.8935903Z 2025-12-04T12:33:30.8935981Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8936342Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8936591Z 2025-12-04T12:33:30.8936684Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8936810Z 2025-12-04T12:33:30.8936812Z 2025-12-04T12:33:30.8936894Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.8937098Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.8937476Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d1b74c890111edb9.xml - 2025-12-04T12:33:30.8937823Z =========================== short test summary info ============================ 2025-12-04T12:33:30.8938204Z FAILED [8.9124s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.8938535Z Traceback (most recent call last): 2025-12-04T12:33:30.8938784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8939048Z getattr(self, test_name)() 2025-12-04T12:33:30.8939281Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8939513Z fn() 2025-12-04T12:33:30.8939715Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8939944Z method(*args, **kwargs) 2025-12-04T12:33:30.8940165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8940568Z method(*args, **kwargs) 2025-12-04T12:33:30.8940791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8941017Z with policy(): 2025-12-04T12:33:30.8941227Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8941459Z raise RuntimeError(msg) 2025-12-04T12:33:30.8941855Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.8942219Z 2025-12-04T12:33:30.8942294Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8942616Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8942864Z 2025-12-04T12:33:30.8942952Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8943142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.8943302Z ============================== 1 failed in 8.92s =============================== 2025-12-04T12:33:30.8943436Z Got exit code 1 2025-12-04T12:33:30.8943532Z Retrying single test... 
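[editor's note] For context on what this failure is actually asserting: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the test harness records the CUDA caching allocator's allocated bytes on each device before a test and fails the test if they do not return to baseline afterwards (the "was 512 and is now reported as 88064" numbers above); per the error text it also cross-checks the CUDA driver's allocated memory. The sketch below only illustrates that idea. It is a simplified stand-in, not PyTorch's actual CudaMemoryLeakCheck, and the assert_no_cuda_leak helper name is hypothetical.

import contextlib
import gc

import torch


@contextlib.contextmanager
def assert_no_cuda_leak(device: int = 0):
    """Fail if caching-allocator memory on `device` grows across the block."""
    if not torch.cuda.is_available():
        yield
        return
    torch.cuda.synchronize(device)
    gc.collect()
    before = torch.cuda.memory_allocated(device)
    yield
    torch.cuda.synchronize(device)
    gc.collect()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible CUDA leak on device {device}: caching allocator went "
            f"from {before} to {after} bytes"
        )


if __name__ == "__main__":
    if torch.cuda.is_available():
        with assert_no_cuda_leak(0):
            x = torch.randn(64, device="cuda")
            del x  # freed back to the caching allocator, so the check passes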
2025-12-04T12:33:30.8943805Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-133ef5108b952965.xml 2025-12-04T12:33:30.8944103Z ============================= test session starts ============================== 2025-12-04T12:33:30.8944316Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.8944504Z cachedir: .pytest_cache 2025-12-04T12:33:30.8944767Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.8945009Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.8945130Z configfile: pytest.ini 2025-12-04T12:33:30.8945359Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.8945631Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.8945942Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8946224Z Running 1 items in this shard 2025-12-04T12:33:30.8946299Z 2025-12-04T12:33:30.8946595Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:31:21.924000 459143 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459212 2025-12-04T12:33:30.8947078Z I1204 12:31:21.924000 459143 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459213 2025-12-04T12:33:30.8947793Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8948465Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8949052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8949634Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8950024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.8950395Z return func(*args, **kwargs) 2025-12-04T12:33:30.8950750Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8951115Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8951470Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8951826Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8952177Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8952517Z seq = FSDP( 2025-12-04T12:33:30.8952836Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8953170Z seq = FSDP( 2025-12-04T12:33:30.8954538Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.8955965Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8957406Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.8958892Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8959199Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8959545Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8960040Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8960526Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8961009Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8961465Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8961912Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8962380Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8962847Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8963311Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8963773Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8964265Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8964724Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8965192Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8965846Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2021654528 and is now 3539992576. 
2025-12-04T12:33:30.8966458Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8966809Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8967397Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8967899Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8968340Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8968758Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.8969002Z dist init r=0, world=2 2025-12-04T12:33:30.8969206Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8969547Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8970068Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8970550Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8971032Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8971480Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8971923Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8972395Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8972859Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8973322Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8973842Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8974298Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8974757Z 
[rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8975226Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8975874Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.8976495Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8976867Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8977440Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8977930Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8978351Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8978767Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.8979010Z dist init r=1, world=2 2025-12-04T12:33:30.8979412Z [rank0]:[W1204 12:31:29.981792807 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.8979821Z FAILED [9.0119s] [100%] 2025-12-04T12:33:30.8979888Z 2025-12-04T12:33:30.8979946Z =================================== FAILURES =================================== 2025-12-04T12:33:30.8980136Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:33:30.8980312Z Traceback (most recent call last): 2025-12-04T12:33:30.8980560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.8980805Z self._join_processes(fn) 2025-12-04T12:33:30.8981054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.8981321Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.8981592Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.8981855Z raise RuntimeError(error) 2025-12-04T12:33:30.8982008Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.8982171Z Traceback (most recent call last): 2025-12-04T12:33:30.8982412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8982694Z getattr(self, test_name)() 2025-12-04T12:33:30.8982927Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8983163Z fn() 2025-12-04T12:33:30.8983366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8983598Z method(*args, **kwargs) 2025-12-04T12:33:30.8983820Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8984051Z method(*args, **kwargs) 2025-12-04T12:33:30.8984270Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8984496Z with policy(): 2025-12-04T12:33:30.8984709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8984942Z raise RuntimeError(msg) 2025-12-04T12:33:30.8985344Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2021654528 and is now 3539992576. 2025-12-04T12:33:30.8985740Z 2025-12-04T12:33:30.8985816Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8986138Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8986386Z 2025-12-04T12:33:30.8986476Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8986603Z 2025-12-04T12:33:30.8986605Z 2025-12-04T12:33:30.8986683Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.8986889Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.8987262Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-133ef5108b952965.xml - 2025-12-04T12:33:30.8987606Z =========================== short test summary info ============================ 2025-12-04T12:33:30.8987939Z FAILED [9.0119s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.8988320Z Traceback (most recent call last): 2025-12-04T12:33:30.8988567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8988811Z getattr(self, test_name)() 2025-12-04T12:33:30.8989045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8989279Z fn() 2025-12-04T12:33:30.8989482Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8989713Z method(*args, **kwargs) 2025-12-04T12:33:30.8989935Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8990164Z method(*args, **kwargs) 2025-12-04T12:33:30.8990383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8990608Z with policy(): 2025-12-04T12:33:30.8990817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8991049Z raise RuntimeError(msg) 2025-12-04T12:33:30.8991488Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2021654528 and is now 3539992576. 2025-12-04T12:33:30.8991855Z 2025-12-04T12:33:30.8991933Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8992258Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8992502Z 2025-12-04T12:33:30.8992592Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8992781Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.8992949Z ======================= 1 failed, 3 deselected in 9.02s ======================== 2025-12-04T12:33:30.8993087Z Got exit code 1 2025-12-04T12:33:30.8993185Z Retrying single test... 
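[editor's note] The UserWarning repeated in each attempt above ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") spells out its own fix: pin each rank to an explicit device index before constructing FSDP, rather than passing the index-less string "cuda". A minimal sketch of that pattern follows; the model, the LOCAL_RANK handling, and the torchrun-style launch are illustrative assumptions, not taken from test_fsdp_fine_tune.py.

import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main() -> None:
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)        # explicit device, as the warning asks
    dist.init_process_group(backend="nccl")  # rendezvous env vars set by the launcher

    model = nn.Linear(16, 16).cuda(local_rank)
    fsdp_model = FSDP(model, device_id=local_rank)  # pass the index, not "cuda"

    out = fsdp_model(torch.randn(4, 16, device=local_rank))
    out.sum().backward()

    # Explicit teardown also avoids the ProcessGroupNCCL shutdown warning seen above.
    dist.destroy_process_group()


if __name__ == "__main__":
    main()

A script like this would be launched with something like torchrun --nproc-per-node=2 on a multi-GPU host (hypothetical invocation, not part of this job's test command).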
2025-12-04T12:33:30.8993459Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-cdaf34baac2ba9f9.xml 2025-12-04T12:33:30.8993774Z ============================= test session starts ============================== 2025-12-04T12:33:30.8993986Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.8994191Z cachedir: .pytest_cache 2025-12-04T12:33:30.8994414Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.8994653Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.8994774Z configfile: pytest.ini 2025-12-04T12:33:30.8995003Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.8995274Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.8995588Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8995869Z Running 1 items in this shard 2025-12-04T12:33:30.8995944Z 2025-12-04T12:33:30.8996241Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:31:33.520000 459379 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459448 2025-12-04T12:33:30.8996726Z I1204 12:31:33.521000 459379 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459449 2025-12-04T12:33:30.8997423Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8998010Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8998630Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8999218Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8999608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.8999976Z return func(*args, **kwargs) 2025-12-04T12:33:30.9000372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9000738Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9001095Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9001452Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9001798Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9002137Z seq = FSDP( 2025-12-04T12:33:30.9002454Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9002789Z seq = FSDP( 2025-12-04T12:33:30.9004121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9005580Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9007019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9008485Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9008794Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9009138Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9009633Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9010117Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9010631Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9011088Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9011532Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9012001Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9012472Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9012937Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9013416Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9013886Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9014348Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9014819Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9015474Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 
2025-12-04T12:33:30.9016083Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9016438Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9017012Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9017505Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9017876Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9018337Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9018581Z dist init r=1, world=2 2025-12-04T12:33:30.9018789Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9019131Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9019653Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9020139Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9020620Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9021073Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9021514Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9021981Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9022446Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9022941Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9023409Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9023864Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9024323Z 
[rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9024791Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9025438Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 2025-12-04T12:33:30.9026043Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9026395Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9026970Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9027459Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9027825Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9028289Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9028532Z dist init r=0, world=2 2025-12-04T12:33:30.9028970Z [rank0]:[W1204 12:31:40.338955905 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9029386Z FAILED [8.7113s] [100%] 2025-12-04T12:33:30.9029453Z 2025-12-04T12:33:30.9029511Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9029702Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:33:30.9029878Z Traceback (most recent call last): 2025-12-04T12:33:30.9030125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9030369Z self._join_processes(fn) 2025-12-04T12:33:30.9030616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9030881Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9031154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9031432Z raise RuntimeError(error) 2025-12-04T12:33:30.9031586Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9031775Z Traceback (most recent call last): 2025-12-04T12:33:30.9032017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9032260Z getattr(self, test_name)() 2025-12-04T12:33:30.9032493Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9032726Z fn() 2025-12-04T12:33:30.9032937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9033169Z method(*args, **kwargs) 2025-12-04T12:33:30.9033395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9033625Z method(*args, **kwargs) 2025-12-04T12:33:30.9033846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9034074Z with policy(): 2025-12-04T12:33:30.9034287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9034518Z raise RuntimeError(msg) 2025-12-04T12:33:30.9034920Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:33:30.9035283Z 2025-12-04T12:33:30.9035358Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9035682Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9035931Z 2025-12-04T12:33:30.9036020Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9036149Z 2025-12-04T12:33:30.9036208Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9036350Z Traceback (most recent call last): 2025-12-04T12:33:30.9036594Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9036837Z getattr(self, test_name)() 2025-12-04T12:33:30.9037071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9046343Z fn() 2025-12-04T12:33:30.9046583Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9046892Z method(*args, **kwargs) 2025-12-04T12:33:30.9047122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9047360Z method(*args, **kwargs) 2025-12-04T12:33:30.9047587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9047818Z with policy(): 2025-12-04T12:33:30.9048036Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9048323Z raise RuntimeError(msg) 2025-12-04T12:33:30.9048729Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.9049097Z 2025-12-04T12:33:30.9049182Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9049508Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9049790Z 2025-12-04T12:33:30.9049886Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9050015Z 2025-12-04T12:33:30.9050016Z 2025-12-04T12:33:30.9050103Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9050309Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9050691Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-cdaf34baac2ba9f9.xml - 2025-12-04T12:33:30.9051045Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9051391Z FAILED [8.7113s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9051715Z Traceback (most recent call last): 2025-12-04T12:33:30.9051968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9052218Z getattr(self, test_name)() 2025-12-04T12:33:30.9052459Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9052696Z fn() 2025-12-04T12:33:30.9052906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9053142Z method(*args, **kwargs) 2025-12-04T12:33:30.9053364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9053598Z method(*args, **kwargs) 2025-12-04T12:33:30.9053819Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9054051Z with policy(): 2025-12-04T12:33:30.9054265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9054498Z raise RuntimeError(msg) 2025-12-04T12:33:30.9054902Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:33:30.9055267Z 2025-12-04T12:33:30.9055342Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9055701Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9055952Z 2025-12-04T12:33:30.9056045Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9056174Z 2025-12-04T12:33:30.9056234Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9056382Z Traceback (most recent call last): 2025-12-04T12:33:30.9056628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9056875Z getattr(self, test_name)() 2025-12-04T12:33:30.9057111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9057347Z fn() 2025-12-04T12:33:30.9057551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9057781Z method(*args, **kwargs) 2025-12-04T12:33:30.9058010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9058290Z method(*args, **kwargs) 2025-12-04T12:33:30.9058511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9058759Z with policy(): 2025-12-04T12:33:30.9058973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9059207Z raise RuntimeError(msg) 2025-12-04T12:33:30.9059607Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.9059973Z 2025-12-04T12:33:30.9060048Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9060375Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9060623Z 2025-12-04T12:33:30.9060715Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9060908Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
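The RuntimeError above comes from PyTorch's opt-in leak checker (enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1), which snapshots caching-allocator and driver memory around the test body and fails the test if usage has grown by the time the context manager exits. A minimal sketch of that idea, assuming a single visible CUDA/ROCm device; this is not the actual torch.testing._internal.common_utils implementation:

    # Sketch of a memory-leak guard in the spirit of PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1.
    # Assumption: one GPU is visible; this is NOT the real common_utils checker.
    import torch

    class MemLeakGuard:
        def __init__(self, device: int = 0) -> None:
            self.device = device
            self.before = 0

        def __enter__(self) -> "MemLeakGuard":
            torch.cuda.synchronize(self.device)
            # Bytes held by the caching allocator before the guarded block runs.
            self.before = torch.cuda.memory_allocated(self.device)
            return self

        def __exit__(self, exc_type, exc, tb) -> bool:
            if exc_type is not None:
                return False  # let the original test failure propagate unchanged
            torch.cuda.synchronize(self.device)
            after = torch.cuda.memory_allocated(self.device)
            if after > self.before:
                raise RuntimeError(
                    f"possible GPU memory leak on device {self.device}: "
                    f"{self.before} -> {after} bytes still allocated"
                )
            return False

Used as `with MemLeakGuard(): run_test_body()`, this fails the same way the log does when a test leaves allocations behind; the repro command printed above runs the real checker against just the failing test.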
2025-12-04T12:33:30.9061077Z ======================= 1 failed, 3 deselected in 8.72s ======================== 2025-12-04T12:33:30.9061220Z Got exit code 1 2025-12-04T12:33:30.9061441Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9061767Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9062142Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-62a07bc624719721.xml 2025-12-04T12:33:30.9062447Z ============================= test session starts ============================== 2025-12-04T12:33:30.9062670Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9062868Z cachedir: .pytest_cache 2025-12-04T12:33:30.9063098Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9063341Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9063466Z configfile: pytest.ini 2025-12-04T12:33:30.9063698Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9063973Z collecting ... collected 4 items / 1 deselected / 3 selected 2025-12-04T12:33:30.9064135Z stepcurrent: skipping 1 already run items. 2025-12-04T12:33:30.9064270Z Running 3 items in this shard 2025-12-04T12:33:30.9064344Z 2025-12-04T12:33:30.9064678Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:31:44.481000 459615 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459684 2025-12-04T12:33:30.9065165Z I1204 12:31:44.482000 459615 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459685 2025-12-04T12:33:30.9065864Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9066458Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9067048Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9067661Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9068059Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9068472Z return func(*args, **kwargs) 2025-12-04T12:33:30.9068832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9069270Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9069632Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9069997Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9070346Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9070692Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9071017Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9071361Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9072711Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9074142Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9075610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9077043Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9077378Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9077741Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9078286Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9078775Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9079263Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9079723Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9080172Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9080644Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9081114Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9081589Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9082058Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9082517Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9082981Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9083451Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9084137Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 
2025-12-04T12:33:30.9084754Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9085113Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9085690Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9086180Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9086553Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9086988Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9087247Z dist init r=1, world=2 2025-12-04T12:33:30.9087456Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9087800Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9088334Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9088831Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9089315Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9089769Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9090211Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9090680Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9091149Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9091616Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9092082Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9092537Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9092996Z 
[rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9093498Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9094151Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9094761Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9095113Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9095689Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9096194Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9096580Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9096995Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9097238Z dist init r=0, world=2 2025-12-04T12:33:30.9097644Z [rank0]:[W1204 12:31:52.601311367 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9098061Z FAILED [10.0141s] [ 33%] 2025-12-04T12:33:30.9098132Z 2025-12-04T12:33:30.9098245Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9098440Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:33:30.9098621Z Traceback (most recent call last): 2025-12-04T12:33:30.9098873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9099122Z self._join_processes(fn) 2025-12-04T12:33:30.9099374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9099641Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9099916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9100177Z raise RuntimeError(error) 2025-12-04T12:33:30.9100332Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9100494Z Traceback (most recent call last): 2025-12-04T12:33:30.9100734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9100975Z getattr(self, test_name)() 2025-12-04T12:33:30.9101205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9101435Z fn() 2025-12-04T12:33:30.9101636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9101865Z method(*args, **kwargs) 2025-12-04T12:33:30.9102085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9102313Z method(*args, **kwargs) 2025-12-04T12:33:30.9102567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9102793Z with policy(): 2025-12-04T12:33:30.9103003Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9103233Z raise RuntimeError(msg) 2025-12-04T12:33:30.9103626Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9103986Z 2025-12-04T12:33:30.9104063Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9104383Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9104626Z 2025-12-04T12:33:30.9104722Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9104861Z 2025-12-04T12:33:30.9104863Z 2025-12-04T12:33:30.9104944Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9105163Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9105535Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-62a07bc624719721.xml - 2025-12-04T12:33:30.9105876Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9106207Z FAILED [10.0141s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9106517Z Traceback (most recent call last): 2025-12-04T12:33:30.9106768Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9107013Z getattr(self, test_name)() 2025-12-04T12:33:30.9107246Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9107479Z fn() 2025-12-04T12:33:30.9107679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9107909Z method(*args, **kwargs) 2025-12-04T12:33:30.9108127Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9108408Z method(*args, **kwargs) 2025-12-04T12:33:30.9108628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9108853Z with policy(): 2025-12-04T12:33:30.9109067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9109298Z raise RuntimeError(msg) 2025-12-04T12:33:30.9109695Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9110061Z 2025-12-04T12:33:30.9110135Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9110455Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9110698Z 2025-12-04T12:33:30.9110787Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9111013Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9111181Z ======================= 1 failed, 1 deselected in 10.02s ======================= 2025-12-04T12:33:30.9111320Z Got exit code 1 2025-12-04T12:33:30.9111417Z Retrying single test... 
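The repeated ProcessGroupNCCL warning ("destroy_process_group() was not called before program exit") is advisory: it fires whenever a rank exits while the default process group is still initialized. A hedged sketch of the explicit teardown it asks for, written as a hypothetical standalone script launched with torchrun; the test harness in this log manages its own process groups, so this is illustration only:

    # Hypothetical standalone pattern (launch with torchrun); not code from this test suite.
    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(rank)  # also addresses the FSDP `device_id` UserWarning seen above
        # Passing device_id to init_process_group is what the barrier() UserWarning suggests.
        dist.init_process_group("nccl", device_id=torch.device("cuda", rank))
        try:
            pass  # training / test body would go here
        finally:
            # Explicit teardown; skipping this is what triggers the
            # "destroy_process_group() was not called" warning in the log.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()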
2025-12-04T12:33:30.9111687Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-fa7737fbb8bb2551.xml 2025-12-04T12:33:30.9111985Z ============================= test session starts ============================== 2025-12-04T12:33:30.9112198Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9112385Z cachedir: .pytest_cache 2025-12-04T12:33:30.9112609Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9112848Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9112968Z configfile: pytest.ini 2025-12-04T12:33:30.9113198Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9113487Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9113796Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9114094Z Running 1 items in this shard 2025-12-04T12:33:30.9114169Z 2025-12-04T12:33:30.9114462Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:31:56.790000 459851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459920 2025-12-04T12:33:30.9114941Z I1204 12:31:56.790000 459851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459921 2025-12-04T12:33:30.9115636Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9116228Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9116814Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9117401Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9117796Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9118212Z return func(*args, **kwargs) 2025-12-04T12:33:30.9118570Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9118931Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9119288Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9119644Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9119991Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9120360Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9120684Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9121022Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9122366Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9123825Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9125262Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9126677Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9126984Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9127328Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9127825Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9128354Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9128838Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9129289Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9129734Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9130232Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9130704Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9131170Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9131635Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9132090Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9132548Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9133028Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9133690Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 
2025-12-04T12:33:30.9134295Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9134650Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9135224Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9135715Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9136083Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9136497Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9136739Z dist init r=0, world=2 2025-12-04T12:33:30.9136944Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9137284Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9137773Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9138304Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9138783Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9139235Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9139713Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9140181Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9140647Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9141113Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9141580Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9142035Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9142506Z 
[rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9142986Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9143629Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9144232Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9144586Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9145159Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9145644Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9146009Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9146425Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9146668Z dist init r=1, world=2 2025-12-04T12:33:30.9147070Z [rank0]:[W1204 12:32:05.787910829 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9147483Z FAILED [9.9121s] [100%] 2025-12-04T12:33:30.9147547Z 2025-12-04T12:33:30.9147608Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9147798Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:33:30.9147974Z Traceback (most recent call last): 2025-12-04T12:33:30.9148359Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9148626Z self._join_processes(fn) 2025-12-04T12:33:30.9148948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9149228Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9149503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9149763Z raise RuntimeError(error) 2025-12-04T12:33:30.9149916Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9150080Z Traceback (most recent call last): 2025-12-04T12:33:30.9150324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9150566Z getattr(self, test_name)() 2025-12-04T12:33:30.9150799Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9151036Z fn() 2025-12-04T12:33:30.9151238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9151487Z method(*args, **kwargs) 2025-12-04T12:33:30.9151709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9151955Z method(*args, **kwargs) 2025-12-04T12:33:30.9152175Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9152403Z with policy(): 2025-12-04T12:33:30.9152617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9152850Z raise RuntimeError(msg) 2025-12-04T12:33:30.9153253Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9153614Z 2025-12-04T12:33:30.9153690Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9154012Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9154256Z 2025-12-04T12:33:30.9154346Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9154470Z 2025-12-04T12:33:30.9154472Z 2025-12-04T12:33:30.9154554Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9154757Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9155135Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-fa7737fbb8bb2551.xml - 2025-12-04T12:33:30.9155480Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9155810Z FAILED [9.9121s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9156119Z Traceback (most recent call last): 2025-12-04T12:33:30.9156364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9156608Z getattr(self, test_name)() 2025-12-04T12:33:30.9156841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9157075Z fn() 2025-12-04T12:33:30.9157278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9157535Z method(*args, **kwargs) 2025-12-04T12:33:30.9157761Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9157992Z method(*args, **kwargs) 2025-12-04T12:33:30.9158255Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9158485Z with policy(): 2025-12-04T12:33:30.9158698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9158928Z raise RuntimeError(msg) 2025-12-04T12:33:30.9159325Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9159686Z 2025-12-04T12:33:30.9159766Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9160088Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9160378Z 2025-12-04T12:33:30.9160466Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9160655Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9160821Z ======================= 1 failed, 3 deselected in 9.92s ======================== 2025-12-04T12:33:30.9160957Z Got exit code 1 2025-12-04T12:33:30.9161054Z Retrying single test... 
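The two recurring FutureWarning/UserWarning messages in these sessions are advisory rather than the cause of the failure; both state their own remediation. A sketch of those suggestions, using a stand-in Linear module and a single-process gloo group purely so the snippet is self-contained (none of this is code from test_fsdp_fine_tune.py):

    # Illustrative only: the alternatives the two recurring warnings themselves suggest.
    import os
    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)  # single-process world for the demo

    # 1) The FutureWarning recommends plain DDP over the deprecated NO_SHARD strategy.
    model = nn.Linear(8, 8)   # stand-in for the test's model
    ddp_model = DDP(model)    # in place of FSDP(model, sharding_strategy=ShardingStrategy.NO_SHARD)

    # 2) If the AccumulateGrad stream mismatch is intentional, the warning names the
    #    switch that silences it:
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)

    dist.destroy_process_group()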
2025-12-04T12:33:30.9161320Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-236a181fc18f35dc.xml 2025-12-04T12:33:30.9161616Z ============================= test session starts ============================== 2025-12-04T12:33:30.9161828Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9162016Z cachedir: .pytest_cache 2025-12-04T12:33:30.9162238Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9162477Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9162595Z configfile: pytest.ini 2025-12-04T12:33:30.9162822Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9163091Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9163399Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9163678Z Running 1 items in this shard 2025-12-04T12:33:30.9163752Z 2025-12-04T12:33:30.9164044Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:32:09.016000 460087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460156 2025-12-04T12:33:30.9164522Z I1204 12:32:09.017000 460087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460157 2025-12-04T12:33:30.9165213Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9165799Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9166414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9167004Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9167403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9167770Z return func(*args, **kwargs) 2025-12-04T12:33:30.9168127Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9168532Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9168890Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9169262Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9169620Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9169959Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9170281Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9170615Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9171950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9173376Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9174811Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9176262Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9176567Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9176910Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9177401Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9177882Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9178402Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9178852Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9179310Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9179792Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9180259Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9180722Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9181186Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9181640Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9182094Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9182559Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9183203Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 
2025-12-04T12:33:30.9183806Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9184157Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9184728Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9185212Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9185614Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9186030Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9186272Z dist init r=0, world=2 2025-12-04T12:33:30.9186475Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9186813Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9187298Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9187776Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9188294Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9188773Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9189210Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9189673Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9190138Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9190602Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9191066Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9191519Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9191975Z 
[rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9192443Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9193085Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9193686Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9194035Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9194636Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9195118Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9195489Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9195904Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9196143Z dist init r=1, world=2 2025-12-04T12:33:30.9196544Z [rank0]:[W1204 12:32:17.089848995 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9196953Z FAILED [10.0126s] [100%] 2025-12-04T12:33:30.9197020Z 2025-12-04T12:33:30.9197082Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9197282Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:33:30.9197456Z Traceback (most recent call last): 2025-12-04T12:33:30.9197712Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9197954Z self._join_processes(fn) 2025-12-04T12:33:30.9198238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9198501Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9198767Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9199024Z raise RuntimeError(error) 2025-12-04T12:33:30.9199179Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9199340Z Traceback (most recent call last): 2025-12-04T12:33:30.9199580Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9199821Z getattr(self, test_name)() 2025-12-04T12:33:30.9200053Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9200282Z fn() 2025-12-04T12:33:30.9200484Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9200713Z method(*args, **kwargs) 2025-12-04T12:33:30.9200932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9201201Z method(*args, **kwargs) 2025-12-04T12:33:30.9201425Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9201651Z with policy(): 2025-12-04T12:33:30.9201862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9202094Z raise RuntimeError(msg) 2025-12-04T12:33:30.9202491Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9202850Z 2025-12-04T12:33:30.9202926Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9203244Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9203487Z 2025-12-04T12:33:30.9203614Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9203740Z 2025-12-04T12:33:30.9203744Z 2025-12-04T12:33:30.9203821Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9204022Z Process 0 terminated with exit code 10, terminating remaining processes. 
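The _init_utils.py UserWarning above flags a `device_id` of plain `cuda` with no index, and tells each rank to either call `torch.cuda.set_device()` first or pass an indexed device. A minimal sketch of that fix, assuming a per-rank setup like the one this harness uses (the `wrap_model`, `model`, and `rank` names here are illustrative, not taken from test_fsdp_fine_tune.py):

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(model, rank):
    # Bind this process to its GPU before constructing FSDP, as the warning asks.
    torch.cuda.set_device(rank)
    # An indexed device (or the integer rank) avoids the "no explicit index" warning.
    return FSDP(model, device_id=torch.device("cuda", rank))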
2025-12-04T12:33:30.9204392Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-236a181fc18f35dc.xml - 2025-12-04T12:33:30.9204737Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9205067Z FAILED [10.0126s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9205375Z Traceback (most recent call last): 2025-12-04T12:33:30.9205625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9205866Z getattr(self, test_name)() 2025-12-04T12:33:30.9206113Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9206359Z fn() 2025-12-04T12:33:30.9206558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9206786Z method(*args, **kwargs) 2025-12-04T12:33:30.9207005Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9207233Z method(*args, **kwargs) 2025-12-04T12:33:30.9207453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9207676Z with policy(): 2025-12-04T12:33:30.9207891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9208121Z raise RuntimeError(msg) 2025-12-04T12:33:30.9208555Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9208917Z 2025-12-04T12:33:30.9208991Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9209308Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9209555Z 2025-12-04T12:33:30.9209642Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9209828Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
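The RuntimeError above compares caching-allocator byte counts before and after the test body (512 vs. 29696 on device 0). A rough sketch of that kind of before/after comparison using the public torch.cuda counters; this is illustrative only, not the leak-check context manager in common_utils.py:

import torch

def allocated_delta(device, fn):
    # Snapshot the caching allocator, run the work, snapshot again, and
    # report the growth -- the same quantity the failure message prints.
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    fn()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    return before, after, after - before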
2025-12-04T12:33:30.9209994Z ======================= 1 failed, 3 deselected in 10.02s ======================= 2025-12-04T12:33:30.9210130Z Got exit code 1 2025-12-04T12:33:30.9210345Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9210663Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9211031Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-65ad0c2371b7284d.xml 2025-12-04T12:33:30.9211325Z ============================= test session starts ============================== 2025-12-04T12:33:30.9211535Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9211724Z cachedir: .pytest_cache 2025-12-04T12:33:30.9211947Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9212221Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9212339Z configfile: pytest.ini 2025-12-04T12:33:30.9212565Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9212834Z collecting ... collected 4 items / 2 deselected / 2 selected 2025-12-04T12:33:30.9212993Z stepcurrent: skipping 2 already run items. 2025-12-04T12:33:30.9213122Z Running 2 items in this shard 2025-12-04T12:33:30.9213195Z 2025-12-04T12:33:30.9213478Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:32:21.285000 460323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460392 2025-12-04T12:33:30.9213944Z I1204 12:32:21.286000 460323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460393 2025-12-04T12:33:30.9214636Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9215258Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9215848Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9216431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9216821Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9217190Z return func(*args, **kwargs) 2025-12-04T12:33:30.9217544Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9217902Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9218294Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9218648Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9218991Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9219328Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9219649Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9219985Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9221376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9222807Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9224238Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9225679Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9225985Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9226329Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9226822Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9227307Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9227788Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9228322Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9228766Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9229233Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9229697Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9230164Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9230627Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9231082Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9231575Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9232042Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9232677Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 
2025-12-04T12:33:30.9233269Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9233621Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9234181Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9234690Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9235056Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9235476Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9235717Z dist init r=1, world=2 2025-12-04T12:33:30.9235926Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9236264Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9236751Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9237231Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9237710Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9238214Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9238655Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9239120Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9239584Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9240045Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9240539Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9240993Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9241452Z 
[rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9241916Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9242547Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9243140Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9243504Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9244083Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9244560Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9244925Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9245339Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9245578Z dist init r=0, world=2 2025-12-04T12:33:30.9245981Z [rank0]:[W1204 12:32:28.408349308 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9246391Z FAILED [8.8114s] [ 50%] 2025-12-04T12:33:30.9246456Z 2025-12-04T12:33:30.9246516Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9246703Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:33:30.9246877Z Traceback (most recent call last): 2025-12-04T12:33:30.9247123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9247371Z self._join_processes(fn) 2025-12-04T12:33:30.9247619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9247884Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9248189Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9248452Z raise RuntimeError(error) 2025-12-04T12:33:30.9248606Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9248770Z Traceback (most recent call last): 2025-12-04T12:33:30.9249010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9249252Z getattr(self, test_name)() 2025-12-04T12:33:30.9249522Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9249754Z fn() 2025-12-04T12:33:30.9249957Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9250190Z method(*args, **kwargs) 2025-12-04T12:33:30.9250411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9250640Z method(*args, **kwargs) 2025-12-04T12:33:30.9250858Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9251085Z with policy(): 2025-12-04T12:33:30.9251297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9251526Z raise RuntimeError(msg) 2025-12-04T12:33:30.9251912Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9252277Z 2025-12-04T12:33:30.9252353Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9252677Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9252915Z 2025-12-04T12:33:30.9253003Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9253129Z 2025-12-04T12:33:30.9253131Z 2025-12-04T12:33:30.9253208Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9253409Z Process 1 terminated with exit code 10, terminating remaining processes. 
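The ProcessGroupNCCL warning just above complains that destroy_process_group() was never called before exit. A minimal teardown sketch matching what it asks for (the `teardown` name is illustrative):

import torch.distributed as dist

def teardown():
    # Explicitly tear the process group down so NCCL/RCCL resources are
    # released before the worker process exits, as the warning requests.
    if dist.is_initialized():
        dist.destroy_process_group()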
2025-12-04T12:33:30.9253781Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-65ad0c2371b7284d.xml - 2025-12-04T12:33:30.9254122Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9254438Z FAILED [8.8114s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9254734Z Traceback (most recent call last): 2025-12-04T12:33:30.9254977Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9255220Z getattr(self, test_name)() 2025-12-04T12:33:30.9255453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9255684Z fn() 2025-12-04T12:33:30.9255883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9256113Z method(*args, **kwargs) 2025-12-04T12:33:30.9256333Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9256560Z method(*args, **kwargs) 2025-12-04T12:33:30.9256779Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9257004Z with policy(): 2025-12-04T12:33:30.9257213Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9257444Z raise RuntimeError(msg) 2025-12-04T12:33:30.9257831Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9258221Z 2025-12-04T12:33:30.9258324Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9258634Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9258870Z 2025-12-04T12:33:30.9258961Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9259148Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9259313Z ======================= 1 failed, 2 deselected in 8.82s ======================== 2025-12-04T12:33:30.9259450Z Got exit code 1 2025-12-04T12:33:30.9259545Z Retrying single test... 
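The repeated FutureWarning in this run says the `NO_SHARD` strategy is deprecated and points at DistributedDataParallel for the unsharded case. A minimal sketch of that suggested replacement (the `wrap_unsharded`, `model`, and `rank` names are illustrative):

from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_unsharded(model, rank):
    # Plain DDP wrapping, which the deprecation warning recommends instead
    # of FSDP with ShardingStrategy.NO_SHARD.
    return DDP(model.cuda(rank), device_ids=[rank])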
2025-12-04T12:33:30.9259813Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-809e0096656d718a.xml 2025-12-04T12:33:30.9260109Z ============================= test session starts ============================== 2025-12-04T12:33:30.9260321Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9260525Z cachedir: .pytest_cache 2025-12-04T12:33:30.9260747Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9261002Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9261119Z configfile: pytest.ini 2025-12-04T12:33:30.9261347Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9261619Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9261923Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:33:30.9262194Z Running 1 items in this shard 2025-12-04T12:33:30.9262266Z 2025-12-04T12:33:30.9262551Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:32:32.668000 460559 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460628 2025-12-04T12:33:30.9263021Z I1204 12:32:32.669000 460559 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460629 2025-12-04T12:33:30.9263712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9264299Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9264882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9265468Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9265858Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9266224Z return func(*args, **kwargs) 2025-12-04T12:33:30.9266580Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9266938Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9267315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9267674Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9268021Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9268410Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9268733Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9269073Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9270413Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9271877Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9273315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9274740Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9275045Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9275389Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9275880Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9276361Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9276873Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9277324Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9277767Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9278286Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9278753Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9279216Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9279683Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9280148Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9280621Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9281089Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9281728Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 18432 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 
2025-12-04T12:33:30.9282321Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9282671Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9283233Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9283713Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9284080Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9284495Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9284738Z dist init r=1, world=2 2025-12-04T12:33:30.9284942Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9285279Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9285765Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9286281Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9286763Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9287216Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9287656Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9288122Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9288614Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9289077Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9289557Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9290022Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9290478Z 
[rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9290948Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9291583Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9292179Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9292531Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9293095Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9293570Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9293941Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9294357Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9294602Z dist init r=0, world=2 2025-12-04T12:33:30.9295002Z [rank0]:[W1204 12:32:39.632374026 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9295046Z FAILED [9.0118s] [100%] 2025-12-04T12:33:30.9295048Z 2025-12-04T12:33:30.9295145Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9295239Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:33:30.9295293Z Traceback (most recent call last): 2025-12-04T12:33:30.9295465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9295513Z self._join_processes(fn) 2025-12-04T12:33:30.9295696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9295753Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9295941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9295987Z raise RuntimeError(error) 2025-12-04T12:33:30.9296076Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9296131Z Traceback (most recent call last): 2025-12-04T12:33:30.9296299Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9296357Z getattr(self, test_name)() 2025-12-04T12:33:30.9296535Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9296574Z fn() 2025-12-04T12:33:30.9296733Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9296777Z method(*args, **kwargs) 2025-12-04T12:33:30.9296934Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9296978Z method(*args, **kwargs) 2025-12-04T12:33:30.9297138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9297180Z with policy(): 2025-12-04T12:33:30.9297340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9297386Z raise RuntimeError(msg) 2025-12-04T12:33:30.9297711Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9297713Z 2025-12-04T12:33:30.9297798Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9298001Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9298003Z 2025-12-04T12:33:30.9298100Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9298104Z 2025-12-04T12:33:30.9298106Z 2025-12-04T12:33:30.9298226Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9302864Z Process 0 terminated with exit code 10, terminating remaining processes. 
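The c10d_logger warning seen throughout this run ("barrier(): using the device under current context") says it can be muted by passing `device_id` to `init_process_group`. A minimal sketch of that init, assuming the usual env:// rendezvous with MASTER_ADDR/MASTER_PORT already set by the harness (the `init` name is illustrative):

import torch
import torch.distributed as dist

def init(rank, world_size):
    # Passing device_id here is what the barrier() warning suggests; it binds
    # the default process group to this rank's GPU up front.
    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )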
2025-12-04T12:33:30.9303134Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-809e0096656d718a.xml - 2025-12-04T12:33:30.9303200Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9303427Z FAILED [9.0118s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9303475Z Traceback (most recent call last): 2025-12-04T12:33:30.9303650Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9303752Z getattr(self, test_name)() 2025-12-04T12:33:30.9303915Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9303956Z fn() 2025-12-04T12:33:30.9304110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9304156Z method(*args, **kwargs) 2025-12-04T12:33:30.9304309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9304356Z method(*args, **kwargs) 2025-12-04T12:33:30.9304506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9304547Z with policy(): 2025-12-04T12:33:30.9304702Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9304753Z raise RuntimeError(msg) 2025-12-04T12:33:30.9305075Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9305121Z 2025-12-04T12:33:30.9305204Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9305409Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9305412Z 2025-12-04T12:33:30.9305505Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9305574Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9305639Z ======================= 1 failed, 3 deselected in 9.02s ======================== 2025-12-04T12:33:30.9305681Z Got exit code 1 2025-12-04T12:33:30.9305723Z Retrying single test... 
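The "Process 0 exited with error code 10" / "Got exit code 1" lines come from the multiprocess harness joining its per-rank workers and surfacing any non-zero exit code. A loose analogue of that flow, not the actual implementation in common_distributed.py (the `run_two_ranks` and `target` names are illustrative):

import torch.multiprocessing as mp

def run_two_ranks(target):
    # Start one process per rank, join them, and raise if any rank exited
    # non-zero (exit code 10 marks the leak-check failure in this log).
    procs = [mp.Process(target=target, args=(rank,)) for rank in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")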
2025-12-04T12:33:30.9305932Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-03c4575fd3440fa8.xml 2025-12-04T12:33:30.9305994Z ============================= test session starts ============================== 2025-12-04T12:33:30.9306114Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9306156Z cachedir: .pytest_cache 2025-12-04T12:33:30.9306321Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9306372Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9306420Z configfile: pytest.ini 2025-12-04T12:33:30.9306586Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9306666Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9306864Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:33:30.9306916Z Running 1 items in this shard 2025-12-04T12:33:30.9306920Z 2025-12-04T12:33:30.9307203Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:32:43.950000 460795 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460864 2025-12-04T12:33:30.9307365Z I1204 12:32:43.951000 460795 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460865 2025-12-04T12:33:30.9307894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9307961Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9308501Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9308564Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9308863Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9308911Z return func(*args, **kwargs) 2025-12-04T12:33:30.9309198Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9309283Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9309559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9309607Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9309876Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9309919Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9310185Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9310228Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9311525Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9311658Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9312955Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9313084Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9313233Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9313402Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9313698Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9313861Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9314162Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9314307Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9314590Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9314743Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9315026Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9315177Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9315461Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9315603Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9315886Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9316041Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9316495Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 
2025-12-04T12:33:30.9316617Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9316817Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9317171Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9317292Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9317508Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9317678Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9317718Z dist init r=1, world=2 2025-12-04T12:33:30.9317862Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9318026Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9318355Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9318538Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9318826Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9318955Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9319236Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9319388Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9319666Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9319820Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9320097Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9320241Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9320522Z 
[rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9320676Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9321123Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9321240Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9321467Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9321796Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9321915Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9322131Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9322297Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9322340Z dist init r=0, world=2 2025-12-04T12:33:30.9322685Z [rank0]:[W1204 12:32:51.723339273 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9322752Z FAILED [8.7123s] [100%] 2025-12-04T12:33:30.9322754Z 2025-12-04T12:33:30.9322813Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9322908Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:33:30.9322956Z Traceback (most recent call last): 2025-12-04T12:33:30.9323123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9323169Z self._join_processes(fn) 2025-12-04T12:33:30.9323346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9323402Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9323584Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9323630Z raise RuntimeError(error) 2025-12-04T12:33:30.9323718Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9323764Z Traceback (most recent call last): 2025-12-04T12:33:30.9323929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9323974Z getattr(self, test_name)() 2025-12-04T12:33:30.9324138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9324177Z fn() 2025-12-04T12:33:30.9324330Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9324377Z method(*args, **kwargs) 2025-12-04T12:33:30.9324529Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9324573Z method(*args, **kwargs) 2025-12-04T12:33:30.9324727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9324769Z with policy(): 2025-12-04T12:33:30.9324924Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9324969Z raise RuntimeError(msg) 2025-12-04T12:33:30.9325289Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:33:30.9325291Z 2025-12-04T12:33:30.9325396Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9325601Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9325604Z 2025-12-04T12:33:30.9325696Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9325698Z 2025-12-04T12:33:30.9325761Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9325807Z Traceback (most recent call last): 2025-12-04T12:33:30.9325975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9326018Z getattr(self, test_name)() 2025-12-04T12:33:30.9326180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9326216Z fn() 2025-12-04T12:33:30.9326373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9326432Z method(*args, **kwargs) 2025-12-04T12:33:30.9326586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9326638Z method(*args, **kwargs) 2025-12-04T12:33:30.9326794Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9326832Z with policy(): 2025-12-04T12:33:30.9326989Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9327030Z raise RuntimeError(msg) 2025-12-04T12:33:30.9327355Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9327357Z 2025-12-04T12:33:30.9327435Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9327638Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9327641Z 2025-12-04T12:33:30.9327733Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9327735Z 2025-12-04T12:33:30.9327737Z 2025-12-04T12:33:30.9327817Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9327909Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9328209Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-03c4575fd3440fa8.xml - 2025-12-04T12:33:30.9328277Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9328496Z FAILED [8.7123s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9328548Z Traceback (most recent call last): 2025-12-04T12:33:30.9328714Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9328760Z getattr(self, test_name)() 2025-12-04T12:33:30.9328920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9328958Z fn() 2025-12-04T12:33:30.9329110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9329153Z method(*args, **kwargs) 2025-12-04T12:33:30.9329336Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9329383Z method(*args, **kwargs) 2025-12-04T12:33:30.9329536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9329577Z with policy(): 2025-12-04T12:33:30.9329733Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9329775Z raise RuntimeError(msg) 2025-12-04T12:33:30.9330094Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:33:30.9330096Z 2025-12-04T12:33:30.9330171Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9330375Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9330390Z 2025-12-04T12:33:30.9330479Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9330494Z 2025-12-04T12:33:30.9330556Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9330602Z Traceback (most recent call last): 2025-12-04T12:33:30.9330770Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9330813Z getattr(self, test_name)() 2025-12-04T12:33:30.9330976Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9331011Z fn() 2025-12-04T12:33:30.9331168Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9331209Z method(*args, **kwargs) 2025-12-04T12:33:30.9331364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9331408Z method(*args, **kwargs) 2025-12-04T12:33:30.9331560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9331599Z with policy(): 2025-12-04T12:33:30.9331752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9331796Z raise RuntimeError(msg) 2025-12-04T12:33:30.9332112Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9332114Z 2025-12-04T12:33:30.9332191Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9332390Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9332393Z 2025-12-04T12:33:30.9332485Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9332550Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:33:30.9332617Z ======================= 1 failed, 3 deselected in 8.72s ======================== 2025-12-04T12:33:30.9332655Z Got exit code 1 2025-12-04T12:33:30.9332810Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:33:30.9332940Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9333174Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-5b5895a03e1f67ac.xml 2025-12-04T12:33:30.9333240Z ============================= test session starts ============================== 2025-12-04T12:33:30.9333354Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9333400Z cachedir: .pytest_cache 2025-12-04T12:33:30.9333561Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9333611Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9333653Z configfile: pytest.ini 2025-12-04T12:33:30.9333822Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9333896Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9333955Z stepcurrent: skipping 3 already run items. 2025-12-04T12:33:30.9334004Z Running 1 items in this shard 2025-12-04T12:33:30.9334006Z 2025-12-04T12:33:30.9334308Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:32:54.981000 461031 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461100 2025-12-04T12:33:30.9334485Z I1204 12:32:54.982000 461031 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461101 2025-12-04T12:33:30.9334986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9335052Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9335543Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9335609Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9335902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:30.9335948Z return func(*args, **kwargs) 2025-12-04T12:33:30.9336095Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9336263Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9336557Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9336718Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9337007Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9337135Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9337437Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9337588Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9337870Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9338019Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9338337Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9338478Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9338776Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9338942Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9339409Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 
2025-12-04T12:33:30.9339532Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9339729Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9340081Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9340200Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9340413Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9340584Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9340624Z dist init r=1, world=2 2025-12-04T12:33:30.9340766Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9340928Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9341222Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9341378Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9341698Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9341829Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9342108Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9342260Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9342539Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9342689Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9342968Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9343129Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:33:30.9343414Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9343564Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9344029Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9344147Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9344347Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9344696Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9344813Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9345030Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9345197Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9345241Z dist init r=0, world=2 2025-12-04T12:33:30.9345580Z [rank0]:[W1204 12:33:03.666204771 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9345623Z FAILED [9.5118s] [100%] 2025-12-04T12:33:30.9345625Z 2025-12-04T12:33:30.9345682Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9345781Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:33:30.9345848Z Traceback (most recent call last): 2025-12-04T12:33:30.9346015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9346060Z self._join_processes(fn) 2025-12-04T12:33:30.9346237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9346293Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9346476Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9346523Z raise RuntimeError(error) 2025-12-04T12:33:30.9346606Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9346654Z Traceback (most recent call last): 2025-12-04T12:33:30.9346817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9346864Z getattr(self, test_name)() 2025-12-04T12:33:30.9347023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9347072Z fn() 2025-12-04T12:33:30.9347236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9347278Z method(*args, **kwargs) 2025-12-04T12:33:30.9347429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9347472Z method(*args, **kwargs) 2025-12-04T12:33:30.9347623Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9347666Z with policy(): 2025-12-04T12:33:30.9347818Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9347861Z raise RuntimeError(msg) 2025-12-04T12:33:30.9348226Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9348230Z 2025-12-04T12:33:30.9348306Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9348522Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9348525Z 2025-12-04T12:33:30.9348615Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9348617Z 2025-12-04T12:33:30.9348618Z 2025-12-04T12:33:30.9348696Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9348786Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9349035Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-5b5895a03e1f67ac.xml - 2025-12-04T12:33:30.9349097Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9349331Z FAILED [9.5118s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9349377Z Traceback (most recent call last): 2025-12-04T12:33:30.9349543Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9349585Z getattr(self, test_name)() 2025-12-04T12:33:30.9349773Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9349807Z fn() 2025-12-04T12:33:30.9349959Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9350001Z method(*args, **kwargs) 2025-12-04T12:33:30.9350155Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9350195Z method(*args, **kwargs) 2025-12-04T12:33:30.9350348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9350384Z with policy(): 2025-12-04T12:33:30.9350537Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9350577Z raise RuntimeError(msg) 2025-12-04T12:33:30.9350912Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9350928Z 2025-12-04T12:33:30.9351004Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9351238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9351240Z 2025-12-04T12:33:30.9351328Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9351391Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9351457Z ======================= 1 failed, 3 deselected in 9.52s ======================== 2025-12-04T12:33:30.9351493Z Got exit code 1 2025-12-04T12:33:30.9351534Z Retrying single test... 
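Every run above also opens with the same UserWarning from torch/distributed/fsdp/_init_utils.py: the test passes `device_id` as "cuda" without an index, so FSDP falls back to the current device on each rank. The warning itself names the fix, either call torch.cuda.set_device() before constructing FSDP or pass an indexed device as `device_id`. A hedged sketch of that remediation follows; wrap_model and the rank-to-device mapping are illustrative names, not code from this test, and the default process group is assumed to be initialized already. (The FutureWarning about the deprecated `NO_SHARD` strategy is a separate notice; it simply recommends DistributedDataParallel for that case.)

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(model: torch.nn.Module) -> FSDP:
    # Hypothetical helper (assumes the default process group is already
    # initialized): pin this rank to a concrete device before building FSDP,
    # as the UserWarning above recommends.
    rank = dist.get_rank()
    device = torch.device("cuda", rank % torch.cuda.device_count())
    torch.cuda.set_device(device)                    # option 1 from the warning
    return FSDP(model.to(device), device_id=device)  # option 2: indexed device_id
```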
2025-12-04T12:33:30.9351740Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-8465ced8e9a91520.xml 2025-12-04T12:33:30.9351801Z ============================= test session starts ============================== 2025-12-04T12:33:30.9351913Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9351957Z cachedir: .pytest_cache 2025-12-04T12:33:30.9352115Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9352162Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9352202Z configfile: pytest.ini 2025-12-04T12:33:30.9352366Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9352438Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9352652Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9352700Z Running 1 items in this shard 2025-12-04T12:33:30.9352702Z 2025-12-04T12:33:30.9352995Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:33:06.768000 461267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461336 2025-12-04T12:33:30.9353151Z I1204 12:33:06.769000 461267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461337 2025-12-04T12:33:30.9353667Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9353730Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9354219Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9354281Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9354578Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:30.9354625Z return func(*args, **kwargs) 2025-12-04T12:33:30.9354780Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9354947Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9355256Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9355425Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9355718Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9355849Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9356134Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9356293Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9356572Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9356726Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9357007Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9357153Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9357435Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9357591Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9358081Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 
2025-12-04T12:33:30.9358228Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9358428Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9358775Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9358895Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9359110Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9359278Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9359338Z dist init r=0, world=2 2025-12-04T12:33:30.9359480Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9359661Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9359952Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9360115Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9360405Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9360536Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9360816Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9360967Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9361248Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9361399Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9361680Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9361820Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:33:30.9362102Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9362252Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9362739Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 2025-12-04T12:33:30.9362859Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9363056Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9363402Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9363519Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9363733Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9363922Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9363962Z dist init r=1, world=2 2025-12-04T12:33:30.9364303Z [rank0]:[W1204 12:33:14.375356835 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9364343Z FAILED [9.4107s] [100%] 2025-12-04T12:33:30.9364345Z 2025-12-04T12:33:30.9364405Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9364504Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:33:30.9364554Z Traceback (most recent call last): 2025-12-04T12:33:30.9364718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9364766Z self._join_processes(fn) 2025-12-04T12:33:30.9364940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9364997Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9365176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9365223Z raise RuntimeError(error) 2025-12-04T12:33:30.9365304Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9365353Z Traceback (most recent call last): 2025-12-04T12:33:30.9365517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9365562Z getattr(self, test_name)() 2025-12-04T12:33:30.9365721Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9365759Z fn() 2025-12-04T12:33:30.9365911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9365955Z method(*args, **kwargs) 2025-12-04T12:33:30.9366109Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9366151Z method(*args, **kwargs) 2025-12-04T12:33:30.9366305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9366342Z with policy(): 2025-12-04T12:33:30.9366524Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9366567Z raise RuntimeError(msg) 2025-12-04T12:33:30.9366902Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9366905Z 2025-12-04T12:33:30.9366981Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9367200Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9367203Z 2025-12-04T12:33:30.9367292Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9367294Z 2025-12-04T12:33:30.9367296Z 2025-12-04T12:33:30.9367376Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9367480Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9367727Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-8465ced8e9a91520.xml - 2025-12-04T12:33:30.9367801Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9368036Z FAILED [9.4107s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9368085Z Traceback (most recent call last): 2025-12-04T12:33:30.9368272Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9368317Z getattr(self, test_name)() 2025-12-04T12:33:30.9368483Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9368524Z fn() 2025-12-04T12:33:30.9368677Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9368721Z method(*args, **kwargs) 2025-12-04T12:33:30.9368873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9368916Z method(*args, **kwargs) 2025-12-04T12:33:30.9369067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9369110Z with policy(): 2025-12-04T12:33:30.9369264Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9369309Z raise RuntimeError(msg) 2025-12-04T12:33:30.9369644Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9369651Z 2025-12-04T12:33:30.9369726Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9369947Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9369949Z 2025-12-04T12:33:30.9370038Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9370106Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9370168Z ======================= 1 failed, 3 deselected in 9.42s ======================== 2025-12-04T12:33:30.9370209Z Got exit code 1 2025-12-04T12:33:30.9370288Z Retrying single test... 
2025-12-04T12:33:30.9370495Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-33077fddbd7467fc.xml 2025-12-04T12:33:30.9370555Z ============================= test session starts ============================== 2025-12-04T12:33:30.9370672Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9370715Z cachedir: .pytest_cache 2025-12-04T12:33:30.9370877Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9370924Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9370966Z configfile: pytest.ini 2025-12-04T12:33:30.9371133Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9371210Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9371424Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9371484Z Running 1 items in this shard 2025-12-04T12:33:30.9371507Z 2025-12-04T12:33:30.9371802Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:33:18.512000 461503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461572 2025-12-04T12:33:30.9371960Z I1204 12:33:18.513000 461503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461573 2025-12-04T12:33:30.9372463Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9372526Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9373019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9373083Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9373376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:30.9373424Z return func(*args, **kwargs) 2025-12-04T12:33:30.9373568Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9373734Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9374027Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9374186Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9374473Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9374624Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9374906Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9375057Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9375336Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9375484Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9375766Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9375914Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9376210Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9376363Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9376826Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 
2025-12-04T12:33:30.9376950Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9377148Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9377497Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9377615Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9377831Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9378001Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9378042Z dist init r=1, world=2 2025-12-04T12:33:30.9378221Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9378383Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9378674Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9378854Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9379146Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9379275Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9379557Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9379709Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9379987Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9380138Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9380428Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9380587Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:33:30.9380868Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9381021Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9381484Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9381602Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9381802Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9382150Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9382267Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9382480Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9382649Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9382690Z dist init r=0, world=2 2025-12-04T12:33:30.9383026Z [rank0]:[W1204 12:33:26.082744436 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9383068Z FAILED [9.4126s] [100%] 2025-12-04T12:33:30.9383070Z 2025-12-04T12:33:30.9383147Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9383245Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:33:30.9383293Z Traceback (most recent call last): 2025-12-04T12:33:30.9383461Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9383505Z self._join_processes(fn) 2025-12-04T12:33:30.9383682Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9383737Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9383921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9383964Z raise RuntimeError(error) 2025-12-04T12:33:30.9384050Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9384095Z Traceback (most recent call last): 2025-12-04T12:33:30.9384260Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9384316Z getattr(self, test_name)() 2025-12-04T12:33:30.9384495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9384530Z fn() 2025-12-04T12:33:30.9384686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9384729Z method(*args, **kwargs) 2025-12-04T12:33:30.9384885Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9384929Z method(*args, **kwargs) 2025-12-04T12:33:30.9385083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9385124Z with policy(): 2025-12-04T12:33:30.9385278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9385323Z raise RuntimeError(msg) 2025-12-04T12:33:30.9385655Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9385658Z 2025-12-04T12:33:30.9385736Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9385954Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9385956Z 2025-12-04T12:33:30.9386052Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9386054Z 2025-12-04T12:33:30.9386056Z 2025-12-04T12:33:30.9386131Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9386225Z Process 0 terminated with exit code 10, terminating remaining processes. 
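Two warnings recur in this log: barrier() falling back to "the device under current context", and destroy_process_group() never being called before program exit. Both concern process-group setup and teardown. Below is a minimal, hedged sketch of a script that handles both explicitly; the LOCAL_RANK wiring and backend choice are illustrative, and the device_id argument to init_process_group depends on the PyTorch version (the barrier warning above indicates it is available in this build).

    import os
    import torch
    import torch.distributed as dist

    def main():
        local_rank = int(os.environ.get("LOCAL_RANK", "0"))
        torch.cuda.set_device(local_rank)
        # Passing device_id addresses the barrier() "using the device under
        # current context" warning seen earlier in this log.
        dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            # Explicit teardown avoids the ProcessGroupNCCL warning that
            # destroy_process_group() was not called before program exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()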
2025-12-04T12:33:30.9386476Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-33077fddbd7467fc.xml - 2025-12-04T12:33:30.9386539Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9386775Z FAILED [9.4126s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9386823Z Traceback (most recent call last): 2025-12-04T12:33:30.9387014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9387058Z getattr(self, test_name)() 2025-12-04T12:33:30.9387224Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9387260Z fn() 2025-12-04T12:33:30.9387415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9387455Z method(*args, **kwargs) 2025-12-04T12:33:30.9387610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9387651Z method(*args, **kwargs) 2025-12-04T12:33:30.9387804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9387842Z with policy(): 2025-12-04T12:33:30.9387999Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9388042Z raise RuntimeError(msg) 2025-12-04T12:33:30.9388435Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9388452Z 2025-12-04T12:33:30.9388530Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9388747Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9388749Z 2025-12-04T12:33:30.9388843Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9388909Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
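The earlier UserWarning from _init_utils.py says FSDP received `device_id` as the bare "cuda" device without an explicit index and fell back to the current device; its suggested fixes are to call torch.cuda.set_device() before FSDP initialization or to pass an explicit device index. A minimal sketch of both options, assuming a process group is already initialized; the placeholder module and rank plumbing are not the test's real setup.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_for_rank(rank: int) -> FSDP:
        # Option 1 from the warning: make the current device explicit first.
        torch.cuda.set_device(rank)
        model = nn.Linear(8, 8)  # placeholder module, not the test's real model
        # Option 2: pass a device with an explicit index instead of bare "cuda".
        return FSDP(model, device_id=torch.device("cuda", rank))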
2025-12-04T12:33:30.9388977Z ======================= 1 failed, 3 deselected in 9.42s ======================== 2025-12-04T12:33:30.9389016Z Got exit code 1 2025-12-04T12:33:30.9389188Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9389317Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9389523Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e112a4a560e98aab.xml 2025-12-04T12:33:30.9389582Z ============================= test session starts ============================== 2025-12-04T12:33:30.9389697Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9389740Z cachedir: .pytest_cache 2025-12-04T12:33:30.9389905Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9389954Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9390000Z configfile: pytest.ini 2025-12-04T12:33:30.9390164Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9390242Z collecting ... collected 4 items / 4 deselected / 0 selected 2025-12-04T12:33:30.9390300Z stepcurrent: skipping 4 already run items. 2025-12-04T12:33:30.9390346Z Running 0 items in this shard 2025-12-04T12:33:30.9390348Z 2025-12-04T12:33:30.9390599Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e112a4a560e98aab.xml - 2025-12-04T12:33:30.9390660Z ============================ 4 deselected in 0.00s ============================= 2025-12-04T12:33:30.9391297Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda'] 2025-12-04T12:33:30.9391301Z 2025-12-04T12:33:30.9391498Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_fine_tune 1/1 (test/test-reports/distributed.fsdp.test_fsdp_fine_tune_1.1_f2107156872849a9_.log) 2025-12-04T12:33:30.9391505Z 2025-12-04T12:33:30.9391636Z Finished distributed/fsdp/test_fsdp_fine_tune 1/1 ... [2025-12-04 12:33:30.889189][2291109.538370674], took 2.37min 2025-12-04T12:33:30.9391904Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:33:30.9391994Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:33:30.9392096Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:33:30.9392160Z Uploading artifacts took 0.00 seconds 2025-12-04T12:33:30.9392224Z distributed/fsdp/test_fsdp_fine_tune 1/1 failed! 2025-12-04T12:33:30.9392363Z Running distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 ... 
[2025-12-04 12:33:30.892410][2291109.541594542] 2025-12-04T12:33:30.9392418Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:33:30.9392757Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_dtensor_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:33:30.892589] 2025-12-04T12:42:03.7282893Z 2025-12-04T12:42:03.7283756Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_fsdp_dtensor_state_dict_1.1_429921b2f227c24a_.log) 2025-12-04T12:42:03.7284754Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-129d46d21b0c8aeb.xml 2025-12-04T12:42:03.7285385Z ============================= test session starts ============================== 2025-12-04T12:42:03.7285809Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7286177Z cachedir: .pytest_cache 2025-12-04T12:42:03.7286617Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7287138Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7287394Z configfile: pytest.ini 2025-12-04T12:42:03.7287868Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7289045Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7289931Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7290750Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7291576Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7291898Z collected 15 items 2025-12-04T12:42:03.7292176Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:42:03.7299207Z Running 15 items in this shard: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda, 
test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.7304691Z 2025-12-04T12:42:03.7305174Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:33:32.608000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461876 2025-12-04T12:42:03.7305887Z I1204 12:33:32.608000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461877 2025-12-04T12:42:03.7306293Z I1204 12:33:32.609000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 461878 2025-12-04T12:42:03.7306695Z I1204 12:33:32.610000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 461879 2025-12-04T12:42:03.7307796Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7308754Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7309713Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7310594Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7311461Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7312331Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7313088Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7313872Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7315646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7317100Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7318624Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7320048Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7321531Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7322949Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7324383Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7325857Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7326227Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7326602Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7327138Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7327608Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7328075Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7328550Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7328996Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7329519Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7329977Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7330483Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7330994Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7331560Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7332060Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7332515Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7333259Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 952107008 and is now 2843738112. 
2025-12-04T12:42:03.7333990Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7334398Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7335090Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7335722Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7336140Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7336610Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7336994Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7337334Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7337815Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7338386Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7355482Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7356069Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7356510Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7356970Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7357475Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7357932Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7358462Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7358911Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7359364Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7359826Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7360579Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7361336Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7361682Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7362377Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7362989Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7363350Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7363755Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7364096Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7364431Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7364913Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7365382Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7365845Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7366277Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7366702Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7367189Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7367640Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7368087Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7368577Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7369013Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7369454Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7369905Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7370652Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7371359Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7371695Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7372375Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7372975Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7373321Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7373716Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7374039Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7374361Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7374831Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7375295Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7375756Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7376185Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7376639Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7377087Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7377535Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7377979Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7378481Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7378918Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7379359Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7379840Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7380569Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7381261Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7381596Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7382279Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7382879Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7383227Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7383629Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7383864Z FAILED [8.8170s] [ 6%] 2025-12-04T12:42:03.7383935Z 2025-12-04T12:42:03.7383993Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7384279Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7384548Z Traceback (most recent call last): 2025-12-04T12:42:03.7384799Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7385045Z self._join_processes(fn) 2025-12-04T12:42:03.7385295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7385559Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7385867Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7386130Z raise RuntimeError(error) 2025-12-04T12:42:03.7386283Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7386445Z Traceback (most recent call last): 2025-12-04T12:42:03.7386687Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7386932Z getattr(self, test_name)() 2025-12-04T12:42:03.7387163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7387395Z fn() 2025-12-04T12:42:03.7387597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7387830Z method(*args, **kwargs) 2025-12-04T12:42:03.7388054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7388336Z method(*args, **kwargs) 2025-12-04T12:42:03.7388572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7388820Z with policy(): 2025-12-04T12:42:03.7389035Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7389267Z raise RuntimeError(msg) 2025-12-04T12:42:03.7389775Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7390247Z 2025-12-04T12:42:03.7390322Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7390777Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7391152Z 2025-12-04T12:42:03.7391244Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7391368Z 2025-12-04T12:42:03.7391429Z Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.7391572Z Traceback (most recent call last): 2025-12-04T12:42:03.7391815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7392057Z getattr(self, test_name)() 2025-12-04T12:42:03.7392289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7392521Z fn() 2025-12-04T12:42:03.7392726Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7392957Z method(*args, **kwargs) 2025-12-04T12:42:03.7393176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7393405Z method(*args, **kwargs) 2025-12-04T12:42:03.7393622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7393845Z with policy(): 2025-12-04T12:42:03.7394057Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7394290Z raise RuntimeError(msg) 2025-12-04T12:42:03.7394836Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7395302Z 2025-12-04T12:42:03.7395379Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7395828Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7396205Z 2025-12-04T12:42:03.7396292Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7396418Z 2025-12-04T12:42:03.7396476Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7396619Z Traceback (most recent call last): 2025-12-04T12:42:03.7396862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7397105Z getattr(self, test_name)() 2025-12-04T12:42:03.7397349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7397597Z fn() 2025-12-04T12:42:03.7397797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7398028Z method(*args, **kwargs) 2025-12-04T12:42:03.7398286Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7398519Z method(*args, **kwargs) 2025-12-04T12:42:03.7398739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7398966Z with policy(): 2025-12-04T12:42:03.7399181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7399411Z raise RuntimeError(msg) 2025-12-04T12:42:03.7399918Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 952107008 and is now 2843738112. 2025-12-04T12:42:03.7400386Z 2025-12-04T12:42:03.7400461Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7400909Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7401283Z 2025-12-04T12:42:03.7401371Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7401501Z 2025-12-04T12:42:03.7401503Z 2025-12-04T12:42:03.7401583Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7401791Z Process 0 terminated with exit code 10, terminating remaining processes. 
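Editor's note: the outer RuntimeError above ("Process 0 exited with error code 10") is raised by the multi-process test harness rather than by the test body: the parent spawns one worker per GPU, joins them, and re-raises if any worker exited non-zero (the _join_processes / _check_return_codes frames in the traceback). The sketch below only illustrates that control flow under stated assumptions; it is not the actual implementation in common_distributed.py, and names like _worker and run_in_subprocesses are placeholders.

import multiprocessing as mp

def _worker(rank: int) -> None:
    # Per-rank test body; in the real harness a caught failure makes the
    # worker exit with code 10, which the parent then reports as seen above.
    pass

def run_in_subprocesses(world_size: int = 4) -> None:
    procs = [mp.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_in_subprocesses()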
2025-12-04T12:42:03.7402191Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-129d46d21b0c8aeb.xml - 2025-12-04T12:42:03.7402562Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7403011Z FAILED [8.8170s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7403440Z Traceback (most recent call last): 2025-12-04T12:42:03.7403720Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7403967Z getattr(self, test_name)() 2025-12-04T12:42:03.7404200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7404436Z fn() 2025-12-04T12:42:03.7404637Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7404870Z method(*args, **kwargs) 2025-12-04T12:42:03.7405090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7405319Z method(*args, **kwargs) 2025-12-04T12:42:03.7405538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7405765Z with policy(): 2025-12-04T12:42:03.7405980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7406234Z raise RuntimeError(msg) 2025-12-04T12:42:03.7406737Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.7407219Z 2025-12-04T12:42:03.7407296Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7407745Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7408121Z 2025-12-04T12:42:03.7408248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7408374Z 2025-12-04T12:42:03.7408434Z Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.7408574Z Traceback (most recent call last): 2025-12-04T12:42:03.7408858Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7409106Z getattr(self, test_name)() 2025-12-04T12:42:03.7409338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7409573Z fn() 2025-12-04T12:42:03.7409774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7410008Z method(*args, **kwargs) 2025-12-04T12:42:03.7410226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7410457Z method(*args, **kwargs) 2025-12-04T12:42:03.7410678Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7410907Z with policy(): 2025-12-04T12:42:03.7411118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7411349Z raise RuntimeError(msg) 2025-12-04T12:42:03.7411849Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7412316Z 2025-12-04T12:42:03.7412391Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7412894Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7413270Z 2025-12-04T12:42:03.7413367Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7413498Z 2025-12-04T12:42:03.7413559Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7413710Z Traceback (most recent call last): 2025-12-04T12:42:03.7413965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7414218Z getattr(self, test_name)() 2025-12-04T12:42:03.7414453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7414698Z fn() 2025-12-04T12:42:03.7414913Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7415167Z method(*args, **kwargs) 2025-12-04T12:42:03.7415392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7415639Z method(*args, **kwargs) 2025-12-04T12:42:03.7415857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7416085Z with policy(): 2025-12-04T12:42:03.7416302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7416543Z raise RuntimeError(msg) 2025-12-04T12:42:03.7417061Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 952107008 and is now 2843738112. 2025-12-04T12:42:03.7417535Z 2025-12-04T12:42:03.7417612Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7418069Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7418485Z 2025-12-04T12:42:03.7418576Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7418773Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7418942Z ============================== 1 failed in 8.96s =============================== 2025-12-04T12:42:03.7419084Z Got exit code 1 2025-12-04T12:42:03.7419195Z Retrying single test... 
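Editor's note: each per-device RuntimeError above quotes two counters taken before and after the test, the caching allocator's allocated bytes and the driver-level allocation on that device. A rough way to take the same kind of measurement locally with public torch.cuda APIs is sketched below; this is an approximation for debugging, not the harness's internal leak checker, and device 0 is only an example index.

import torch

def snapshot(device: int = 0):
    torch.cuda.synchronize(device)
    allocator_bytes = torch.cuda.memory_allocated(device)      # "Caching allocator allocated memory"
    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    driver_bytes = total_bytes - free_bytes                    # driver-level usage on the device
    return allocator_bytes, driver_bytes

if torch.cuda.is_available():
    before = snapshot()
    # ... run the suspected test body here ...
    torch.cuda.empty_cache()
    after = snapshot()
    if after[0] > before[0]:
        print(f"caching allocator grew from {before[0]} to {after[0]} bytes -> possible leak")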
2025-12-04T12:42:03.7419502Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a85052cd503004cf.xml 2025-12-04T12:42:03.7419943Z ============================= test session starts ============================== 2025-12-04T12:42:03.7420168Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7420369Z cachedir: .pytest_cache 2025-12-04T12:42:03.7420604Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7420853Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7420982Z configfile: pytest.ini 2025-12-04T12:42:03.7421223Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7421826Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7422277Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7422717Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7423174Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7423334Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7423767Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7424184Z Running 1 items in this shard 2025-12-04T12:42:03.7424263Z 2025-12-04T12:42:03.7424685Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:33:43.922000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 462278 2025-12-04T12:42:03.7425325Z I1204 12:33:43.922000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 462279 2025-12-04T12:42:03.7425681Z I1204 12:33:43.923000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 462280 2025-12-04T12:42:03.7426034Z I1204 12:33:43.924000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 462281 2025-12-04T12:42:03.7426920Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7427682Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7428468Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7429242Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7429994Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7430739Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7431506Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7432247Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7433609Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7435057Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7436490Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7437931Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7439417Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7440841Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7442309Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7443743Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7444039Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7444371Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7444860Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7445350Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7445841Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7446276Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7446709Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7447174Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7447628Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7448078Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7448563Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7449013Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7449461Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7449920Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7450657Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
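Editor's note: the FutureWarning emitted at the start of this run flags FSDP.state_dict_type() / FSDP.set_state_dict_type() as deprecated in favor of the parallelism-agnostic get_state_dict() / set_state_dict() APIs in torch.distributed.checkpoint.state_dict (see the API-doc URL quoted in the warning). The call shape is sketched below; `model` and `optim` are placeholders for the wrapped module and its optimizer, and the exact option fields should be checked against the linked documentation.

import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def roundtrip_state_dict(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Sharded (non-full) state dicts, kept on device; flip the flags to get a
    # full, CPU-offloaded state dict instead.
    opts = StateDictOptions(full_state_dict=False, cpu_offload=False)
    model_sd, optim_sd = get_state_dict(model, optim, options=opts)
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )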
2025-12-04T12:42:03.7451362Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7451742Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7452444Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7453055Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7453416Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7453832Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7454169Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7454499Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7454991Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7455494Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7455969Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7456410Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7456844Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7457297Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7457749Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7458245Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7458705Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7459152Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7459634Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7460097Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7460839Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7461576Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7461931Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7462627Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7463238Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7463598Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7464012Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7464363Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7464694Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7465192Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7465662Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7466135Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7466580Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7467016Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7467475Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7467939Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7468435Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7468904Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7469341Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7469784Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7470236Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7471006Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2843738112. 2025-12-04T12:42:03.7471710Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7472055Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7472750Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7473363Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7473715Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7474136Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7474491Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7474826Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7475309Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7475783Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7476256Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7476701Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7477135Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7477591Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7478053Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7478550Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7479012Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7479457Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7479905Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7480365Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7481135Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7481844Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7482188Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7482882Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7483497Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7483874Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7484300Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7484550Z FAILED [8.5152s] [100%] 2025-12-04T12:42:03.7484619Z 2025-12-04T12:42:03.7484686Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7484977Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7485250Z Traceback (most recent call last): 2025-12-04T12:42:03.7485501Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7485746Z self._join_processes(fn) 2025-12-04T12:42:03.7486004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7486277Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7486555Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7486826Z raise RuntimeError(error) 2025-12-04T12:42:03.7486988Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7487162Z Traceback (most recent call last): 2025-12-04T12:42:03.7487411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7487664Z getattr(self, test_name)() 2025-12-04T12:42:03.7487910Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7488184Z fn() 2025-12-04T12:42:03.7488399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7488639Z method(*args, **kwargs) 2025-12-04T12:42:03.7488874Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7489116Z method(*args, **kwargs) 2025-12-04T12:42:03.7489350Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7489588Z with policy(): 2025-12-04T12:42:03.7489811Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7490055Z raise RuntimeError(msg) 2025-12-04T12:42:03.7490594Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7491063Z 2025-12-04T12:42:03.7491138Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7491588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7491965Z 2025-12-04T12:42:03.7492054Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7492181Z 2025-12-04T12:42:03.7492183Z 2025-12-04T12:42:03.7492264Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7492468Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7492886Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a85052cd503004cf.xml - 2025-12-04T12:42:03.7493267Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7493717Z FAILED [8.5152s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7494143Z Traceback (most recent call last): 2025-12-04T12:42:03.7494390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7494634Z getattr(self, test_name)() 2025-12-04T12:42:03.7494867Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7495101Z fn() 2025-12-04T12:42:03.7495305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7495534Z method(*args, **kwargs) 2025-12-04T12:42:03.7495756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7495984Z method(*args, **kwargs) 2025-12-04T12:42:03.7496209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7496439Z with policy(): 2025-12-04T12:42:03.7496658Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7496897Z raise RuntimeError(msg) 2025-12-04T12:42:03.7497409Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7497936Z 2025-12-04T12:42:03.7498014Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7498505Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7498884Z 2025-12-04T12:42:03.7498972Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7499195Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7499365Z ======================= 1 failed, 14 deselected in 8.65s ======================= 2025-12-04T12:42:03.7499508Z Got exit code 1 2025-12-04T12:42:03.7499607Z Retrying single test... 2025-12-04T12:42:03.7499904Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-fd866558b38d3026.xml 2025-12-04T12:42:03.7500229Z ============================= test session starts ============================== 2025-12-04T12:42:03.7500444Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7500636Z cachedir: .pytest_cache 2025-12-04T12:42:03.7500860Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7501102Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7501228Z configfile: pytest.ini 2025-12-04T12:42:03.7501455Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7502721Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7503179Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7503608Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7504044Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7504191Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7504614Z stepcurrent: skipping 0 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7505024Z Running 1 items in this shard 2025-12-04T12:42:03.7505099Z 2025-12-04T12:42:03.7505512Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:33:55.026000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 462680 2025-12-04T12:42:03.7506113Z I1204 12:33:55.027000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 462681 2025-12-04T12:42:03.7506456Z I1204 12:33:55.028000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 462682 2025-12-04T12:42:03.7506800Z I1204 12:33:55.028000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 462683 2025-12-04T12:42:03.7507672Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7508454Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7509226Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7510007Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7510752Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7511493Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7512232Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7513004Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7514343Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7515880Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7517317Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7518776Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7520231Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7521651Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7523071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7524514Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7524811Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7525139Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7525615Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7526086Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7526555Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7526988Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7527412Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7527865Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7528358Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7528807Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7529292Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7529729Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
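Editor's note: the PytestCollectionWarning printed during collection above ("cannot collect test class 'TestDummyModel' because it has a __init__ constructor") comes from pytest attempting to collect helper nn.Module classes whose names start with "Test". One common way to silence it is to mark such helpers as non-tests; the sketch below is illustrative only, and its class body is a placeholder rather than the real TestDummyModel from test_fsdp_dtensor_state_dict.py.

import torch

class TestDummyModel(torch.nn.Module):
    __test__ = False  # tell pytest this is a fixture model, not a test class

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)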
2025-12-04T12:42:03.7530171Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7530620Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7531365Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7532063Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7532414Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7533118Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7533720Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7534072Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7534470Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7534798Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7535122Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7535592Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7536054Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7536521Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7536954Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7537381Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7537831Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7538323Z E1204 
12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7538799Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7539247Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7539682Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7540122Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7540570Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7541308Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7542028Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7542363Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7543049Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7543654Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7544003Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7544403Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7544728Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7545049Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7545522Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7545984Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7546451Z E1204 12:34:02.457000 462682 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7546886Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7547310Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7547755Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7548276Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7548726Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7549173Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7549607Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7550045Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7550497Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7551233Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
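For orientation, the RuntimeError above comes from the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK harness, which snapshots per-device memory counters before the test body and fails when they have grown afterwards (here from 0 to 2560 caching-allocator bytes on device 2). A rough hand-rolled sketch of the same comparison, assuming torch.cuda.memory_allocated and torch.cuda.mem_get_info as stand-ins for the counters reported in the message (the real checker lives in common_utils.py and may use different internals):

    import torch

    def cuda_mem_snapshot(device: int = 0) -> tuple[int, int]:
        # Bytes currently held by the CUDA caching allocator on `device`.
        allocator_bytes = torch.cuda.memory_allocated(device)
        # Driver-level usage approximated as total minus free device memory.
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)
        return allocator_bytes, total_bytes - free_bytes

    before = cuda_mem_snapshot(0)
    # ... run the suspect test body here ...
    after = cuda_mem_snapshot(0)
    print(f"caching allocator: {before[0]} -> {after[0]} bytes, "
          f"driver: {before[1]} -> {after[1]} bytes")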
2025-12-04T12:42:03.7551952Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7552288Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7552974Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7553654Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7554005Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7554403Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7554729Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7555051Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7555523Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7555986Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7556449Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7556882Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7557355Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7557834Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7558319Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7558766Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7559213Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7559651Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7560093Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7560597Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7561346Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7562043Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7562383Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7563066Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7563671Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7564019Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7564417Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7564652Z FAILED [8.5150s] [100%] 2025-12-04T12:42:03.7564716Z 2025-12-04T12:42:03.7564777Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7565061Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7565339Z Traceback (most recent call last): 2025-12-04T12:42:03.7565590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7565836Z self._join_processes(fn) 2025-12-04T12:42:03.7566084Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7566352Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7566626Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7566891Z raise RuntimeError(error) 2025-12-04T12:42:03.7567088Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7567263Z Traceback (most recent call last): 2025-12-04T12:42:03.7567512Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7567765Z getattr(self, test_name)() 2025-12-04T12:42:03.7568007Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.7568275Z fn()
2025-12-04T12:42:03.7568491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7568724Z method(*args, **kwargs)
2025-12-04T12:42:03.7568948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7569182Z method(*args, **kwargs)
2025-12-04T12:42:03.7569408Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.7569635Z with policy():
2025-12-04T12:42:03.7569870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.7570123Z raise RuntimeError(msg)
2025-12-04T12:42:03.7570628Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T12:42:03.7571095Z 
2025-12-04T12:42:03.7571173Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.7571628Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.7572013Z 
2025-12-04T12:42:03.7572105Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.7572236Z 
2025-12-04T12:42:03.7572239Z 
2025-12-04T12:42:03.7572320Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T12:42:03.7572529Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T12:42:03.7572932Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-fd866558b38d3026.xml -
2025-12-04T12:42:03.7573309Z =========================== short test summary info ============================
2025-12-04T12:42:03.7573765Z FAILED [8.5150s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T12:42:03.7574202Z Traceback (most recent call last):
2025-12-04T12:42:03.7574454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.7574708Z getattr(self, test_name)()
2025-12-04T12:42:03.7574947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.7575186Z fn()
2025-12-04T12:42:03.7575397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7575635Z method(*args, **kwargs)
2025-12-04T12:42:03.7575861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7576096Z method(*args, **kwargs)
2025-12-04T12:42:03.7576350Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.7576586Z with policy():
2025-12-04T12:42:03.7576804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.7577042Z raise RuntimeError(msg)
2025-12-04T12:42:03.7577554Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T12:42:03.7578026Z 
2025-12-04T12:42:03.7578108Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.7578608Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.7578996Z 
2025-12-04T12:42:03.7579093Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.7579300Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.7579470Z ======================= 1 failed, 14 deselected in 8.65s ======================= 2025-12-04T12:42:03.7579611Z Got exit code 1 2025-12-04T12:42:03.7579956Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7580411Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.7580808Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-2cd233f9856036a5.xml 2025-12-04T12:42:03.7581135Z ============================= test session starts ============================== 2025-12-04T12:42:03.7581346Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7581541Z cachedir: .pytest_cache 2025-12-04T12:42:03.7581767Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7582007Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7582129Z configfile: pytest.ini 2025-12-04T12:42:03.7582357Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7582918Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7583357Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7583792Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7584232Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7584381Z collected 15 items / 1 deselected / 14 selected 2025-12-04T12:42:03.7584524Z stepcurrent: skipping 1 already run items. 2025-12-04T12:42:03.7584652Z Running 14 items in this shard 2025-12-04T12:42:03.7584727Z 2025-12-04T12:42:03.7585168Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:34:06.091000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 463082 2025-12-04T12:42:03.7585767Z I1204 12:34:06.092000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 463083 2025-12-04T12:42:03.7586115Z I1204 12:34:06.092000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 463084 2025-12-04T12:42:03.7586458Z I1204 12:34:06.093000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 463085 2025-12-04T12:42:03.7587337Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7588105Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7588887Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7589653Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7590392Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7591137Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7591874Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7592614Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7593967Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7595392Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7596843Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. 
This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7598377Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7599901Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7601338Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7602772Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7604188Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7604484Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7604814Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7605290Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7605784Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7606252Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7606691Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7607117Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7607568Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7608019Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7608516Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7608982Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7609420Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7609862Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7610315Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7611084Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.7611779Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7612117Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7612808Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7613413Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7613765Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7614167Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7614495Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7614818Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7615321Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7615788Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7616253Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7616685Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7617108Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7617556Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7618018Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7618518Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7618969Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7619404Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7619847Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7620300Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7621034Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1101004800 and is now 2843738112. 2025-12-04T12:42:03.7621725Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7622062Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7622745Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7623352Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7623705Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7624106Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7624432Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7624783Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7625256Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7625722Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7626190Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7626624Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7627049Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7627513Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7627976Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7628462Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7628911Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7629352Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7629791Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7630242Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7630972Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7631667Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7632003Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7632688Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7633289Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7633649Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7634112Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7634443Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7634768Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7635240Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7635704Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7636166Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7636599Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7637047Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7637511Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7637961Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7638442Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7638894Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7639336Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7639777Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7640228Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7640962Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7641656Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7641996Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7642678Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7643282Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7643665Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7644068Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7644305Z FAILED [8.6128s] [ 7%] 2025-12-04T12:42:03.7644374Z 2025-12-04T12:42:03.7644434Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7644714Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7644982Z Traceback (most recent call last): 2025-12-04T12:42:03.7645228Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7645477Z self._join_processes(fn) 2025-12-04T12:42:03.7645727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7646007Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7646277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7664272Z raise RuntimeError(error) 2025-12-04T12:42:03.7664434Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7664598Z Traceback (most recent call last): 2025-12-04T12:42:03.7664848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7665094Z getattr(self, test_name)() 2025-12-04T12:42:03.7665327Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7665559Z fn() 2025-12-04T12:42:03.7665766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7666002Z method(*args, **kwargs) 2025-12-04T12:42:03.7666221Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7666452Z method(*args, **kwargs) 2025-12-04T12:42:03.7666669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7666897Z with policy(): 2025-12-04T12:42:03.7667109Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7667340Z raise RuntimeError(msg) 2025-12-04T12:42:03.7667856Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7668369Z 2025-12-04T12:42:03.7668449Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7668903Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7669278Z 2025-12-04T12:42:03.7669373Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7669499Z 2025-12-04T12:42:03.7669501Z 2025-12-04T12:42:03.7669583Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7669786Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7670261Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-2cd233f9856036a5.xml - 2025-12-04T12:42:03.7670629Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7671082Z FAILED [8.6128s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7671508Z Traceback (most recent call last): 2025-12-04T12:42:03.7671759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7672005Z getattr(self, test_name)() 2025-12-04T12:42:03.7672238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7672472Z fn() 2025-12-04T12:42:03.7672677Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7672921Z method(*args, **kwargs) 2025-12-04T12:42:03.7673140Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7673384Z method(*args, **kwargs) 2025-12-04T12:42:03.7673602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7673823Z with policy(): 2025-12-04T12:42:03.7674033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7674263Z raise RuntimeError(msg) 2025-12-04T12:42:03.7674769Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7675239Z 2025-12-04T12:42:03.7675316Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7675764Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7676138Z 2025-12-04T12:42:03.7676226Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7676410Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7676573Z ======================= 1 failed, 1 deselected in 8.75s ======================== 2025-12-04T12:42:03.7676709Z Got exit code 1 2025-12-04T12:42:03.7676807Z Retrying single test... 2025-12-04T12:42:03.7677096Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-57b0d531b0940846.xml 2025-12-04T12:42:03.7677415Z ============================= test session starts ============================== 2025-12-04T12:42:03.7677624Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7677813Z cachedir: .pytest_cache 2025-12-04T12:42:03.7678034Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7678314Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7678432Z configfile: pytest.ini 2025-12-04T12:42:03.7678663Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7679253Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7679692Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7680125Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7680563Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7680707Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7681124Z stepcurrent: skipping 1 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7681529Z Running 1 items in this shard 2025-12-04T12:42:03.7681603Z 2025-12-04T12:42:03.7682028Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:34:17.171000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 463484 2025-12-04T12:42:03.7682643Z I1204 12:34:17.172000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 463485 2025-12-04T12:42:03.7682985Z I1204 12:34:17.173000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 463486 2025-12-04T12:42:03.7683323Z I1204 12:34:17.173000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 463487 2025-12-04T12:42:03.7684199Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7684955Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7685694Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7686438Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7687169Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7687909Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7688732Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7689473Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7690818Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7692263Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7693699Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7695110Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7696531Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7697945Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7699426Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7700831Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7701124Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7701448Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7701922Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7702423Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7702885Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7703314Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7703734Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7704179Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7704625Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7705076Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7705523Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7705958Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:42:03.7706396Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7706842Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7707583Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7708308Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7708674Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7709360Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7709957Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7710306Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7710704Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7711030Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7711348Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7711828Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7712344Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7712803Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7713233Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7713661Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7714110Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7714554Z E1204 12:34:24.640000 
463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7714995Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7715439Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7715871Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7716305Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7716750Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7717476Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7718226Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7718560Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7719243Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7719837Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7720182Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7720581Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7720922Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7721263Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7721732Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7722191Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7722652Z E1204 12:34:24.648000 463484 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7723079Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7723500Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7723944Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7724388Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7724829Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7725274Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7725706Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7726140Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7726584Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7727334Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
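Note on the failure mode: the RuntimeError repeated for every rank above is raised by the CUDA memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables for this mem_leak_check shard. The checker snapshots the caching-allocator counter and the driver-level allocation for each device before the test body and compares them afterwards; here every rank ends with 7680 bytes still held by the caching allocator. A minimal sketch of that before/after accounting follows; it is an approximation for illustration, not the actual implementation in common_utils.py, and the leak_check helper name is made up.

    import torch

    def leak_check(test_fn, device=0):
        # Snapshot caching-allocator and driver-level usage before the test body.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)
        driver_before = total - free_before

        test_fn()

        # Re-measure after the test body; growth in the caching-allocator counter
        # that the driver-level number also confirms is reported as a leak.
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {alloc_before} and is now "
                f"reported as {alloc_after} on device {device}. CUDA driver allocated "
                f"memory was {driver_before} and is now {driver_after}."
            )

Requiring both counters to grow mirrors the "CUDA driver API confirmed a leak" wording: growth in the caching allocator alone can be a false positive, so the driver-level figure serves as confirmation.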
2025-12-04T12:42:03.7728021Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7728382Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7729061Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7729669Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7730016Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7730428Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7730766Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7731085Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7731552Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7732011Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7732533Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7732966Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7733384Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7733829Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7734276Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7734719Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7735164Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7735600Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7736035Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7736482Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7737240Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7737929Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7738297Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7739016Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7739632Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7739992Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7740387Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7740626Z FAILED [8.6151s] [100%] 2025-12-04T12:42:03.7740691Z 2025-12-04T12:42:03.7740752Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7741033Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7741300Z Traceback (most recent call last): 2025-12-04T12:42:03.7741546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7741791Z self._join_processes(fn) 2025-12-04T12:42:03.7742037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7742300Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7742569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7742826Z raise RuntimeError(error) 2025-12-04T12:42:03.7742978Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7743138Z Traceback (most recent call last): 2025-12-04T12:42:03.7743377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7743619Z getattr(self, test_name)() 2025-12-04T12:42:03.7743849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.7744080Z fn() 2025-12-04T12:42:03.7744282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7744511Z method(*args, **kwargs) 2025-12-04T12:42:03.7744731Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7744958Z method(*args, **kwargs) 2025-12-04T12:42:03.7745176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7745399Z with policy(): 2025-12-04T12:42:03.7745609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7745867Z raise RuntimeError(msg) 2025-12-04T12:42:03.7746370Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7746838Z 2025-12-04T12:42:03.7746913Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7747362Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7747735Z 2025-12-04T12:42:03.7747825Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7747951Z 2025-12-04T12:42:03.7747956Z 2025-12-04T12:42:03.7748033Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7748277Z Process 3 terminated with exit code 10, terminating remaining processes. 
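The parent-process traceback above (_join_processes -> _check_return_codes) shows the harness pattern behind "Process 3 terminated with exit code 10, terminating remaining processes.": one worker process per rank runs the test, exits with a status code (10 accompanies the leak-check failure in this log), and the parent converts any non-zero exit into the RuntimeError that pytest reports. A simplified, self-contained sketch of that join-and-check pattern, with a stand-in worker instead of the real MultiProcessTestCase machinery:

    import multiprocessing as mp

    MEM_LEAK_EXIT_CODE = 10  # inferred from the "exit code 10" lines in this log

    def _worker(rank):
        # Stand-in for the per-rank test body; a leak-check failure makes the
        # real worker exit with a dedicated error code instead of 0.
        raise SystemExit(MEM_LEAK_EXIT_CODE if rank == 3 else 0)

    def run_multiprocess_test(world_size=4):
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        run_multiprocess_test()

The real harness additionally forwards each child's traceback text to the parent, which is why the same leak message appears once per rank and then again inside the parent's RuntimeError.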
2025-12-04T12:42:03.7748671Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-57b0d531b0940846.xml - 2025-12-04T12:42:03.7749051Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7749497Z FAILED [8.6151s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7749922Z Traceback (most recent call last): 2025-12-04T12:42:03.7750170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7750412Z getattr(self, test_name)() 2025-12-04T12:42:03.7750646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7750875Z fn() 2025-12-04T12:42:03.7751073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7751300Z method(*args, **kwargs) 2025-12-04T12:42:03.7751517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7751743Z method(*args, **kwargs) 2025-12-04T12:42:03.7751958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7752180Z with policy(): 2025-12-04T12:42:03.7752391Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7752619Z raise RuntimeError(msg) 2025-12-04T12:42:03.7753122Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7753586Z 2025-12-04T12:42:03.7753662Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7754110Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7754484Z 2025-12-04T12:42:03.7754599Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7754786Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7754949Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.7755088Z Got exit code 1 2025-12-04T12:42:03.7755184Z Retrying single test... 
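"Got exit code 1" followed by "Retrying single test..." is the shard runner re-invoking pytest on only the failing test id, which is why a fresh session header and the same collection output appear again below. Roughly, the retry loop looks like the sketch here; retry_single_test is a hypothetical name, and the real runner additionally manages stepcurrent bookkeeping and the per-attempt XML report paths seen in this log.

    import subprocess
    import sys

    TEST_ID = (
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::"
        "TestFSDPWithDeviceMeshAndDTensorCUDA::"
        "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda"
    )

    def retry_single_test(max_attempts=3):
        # Re-run just the failing test id until it passes or attempts run out.
        for attempt in range(1, max_attempts + 1):
            proc = subprocess.run([sys.executable, "-m", "pytest", "-v", TEST_ID])
            print(f"Got exit code {proc.returncode}")
            if proc.returncode == 0:
                return True
            if attempt < max_attempts:
                print("Retrying single test...")
        return False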
2025-12-04T12:42:03.7755477Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ccab4a1f1ee4cc50.xml 2025-12-04T12:42:03.7755796Z ============================= test session starts ============================== 2025-12-04T12:42:03.7756006Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7756191Z cachedir: .pytest_cache 2025-12-04T12:42:03.7756412Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7756652Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7756773Z configfile: pytest.ini 2025-12-04T12:42:03.7757019Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7757584Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7758019Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7758488Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7758924Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7759071Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7759489Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7759895Z Running 1 items in this shard 2025-12-04T12:42:03.7759968Z 2025-12-04T12:42:03.7760377Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:34:28.239000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 463886 2025-12-04T12:42:03.7760973Z I1204 12:34:28.240000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 463887 2025-12-04T12:42:03.7761320Z I1204 12:34:28.241000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 463888 2025-12-04T12:42:03.7761661Z I1204 12:34:28.242000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 463889 2025-12-04T12:42:03.7762529Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7763315Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7764084Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7764831Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7765565Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7766302Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7767038Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7767803Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7769182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7770597Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7772022Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7773438Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7774885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7776293Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7777707Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7779182Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7779477Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7779804Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7780275Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7780737Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7781201Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7781632Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7782057Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7782508Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7782956Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7783401Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7783876Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7784311Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7784746Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7785193Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7785925Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1105199104 and is now 2843738112. 
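Aside from the leak itself, the FutureWarning printed once per rank above flags the deprecated FSDP.set_state_dict_type call at test_fsdp_dtensor_state_dict.py:240 and points at get_state_dict()/set_state_dict() as the replacement. A hedged sketch of that migration, assuming an FSDP-wrapped model and a single optimizer (offload and full-vs-sharded behaviour is controlled via StateDictOptions in the same torch.distributed.checkpoint.state_dict module, not shown here):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
    from torch.distributed.fsdp import ShardedStateDictConfig, StateDictType
    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    def checkpoint_roundtrip(model: FSDP, optimizer: torch.optim.Optimizer):
        # Deprecated pattern that triggers the FutureWarning in this log:
        FSDP.set_state_dict_type(
            model,
            StateDictType.SHARDED_STATE_DICT,
            ShardedStateDictConfig(offload_to_cpu=False),
        )
        legacy_sd = model.state_dict()

        # Replacement named in the warning: works across FSDP1, FSDP2 and DDP.
        model_sd, optim_sd = get_state_dict(model, optimizer)
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
        )
        return legacy_sd, model_sd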
2025-12-04T12:42:03.7786633Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7786968Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7787665Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7788298Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7788648Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7789047Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7789369Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7789689Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7790668Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7791126Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7791588Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7792014Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7792433Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7792878Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7793322Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7793801Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7794248Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7794680Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7795113Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7795557Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7796289Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7797007Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7797339Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7798020Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7798663Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7799010Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7799405Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7799728Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7800046Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7800512Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7800972Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7801433Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7801861Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7802281Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7802724Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7803201Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7803343Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7803611Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7803741Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7804010Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7804154Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7804716Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7804838Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7805027Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7805487Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7805598Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7805801Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7805959Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7806091Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7806242Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7806522Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7806668Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7806946Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7807060Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7807352Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7807513Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7807786Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7807928Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7808235Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7808365Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7808636Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7808792Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7810109Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7810217Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7810408Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7810863Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7810971Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7811173Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7811332Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7811372Z FAILED [8.6133s] [100%] 2025-12-04T12:42:03.7811376Z 2025-12-04T12:42:03.7811435Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7811620Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7811668Z Traceback (most recent call last): 2025-12-04T12:42:03.7811833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7811877Z self._join_processes(fn) 2025-12-04T12:42:03.7812051Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7812104Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7812284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7812328Z raise RuntimeError(error) 2025-12-04T12:42:03.7812436Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7812482Z Traceback (most recent call last): 2025-12-04T12:42:03.7812647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7812691Z getattr(self, test_name)() 2025-12-04T12:42:03.7812853Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7812887Z fn() 2025-12-04T12:42:03.7813040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7813082Z method(*args, **kwargs) 2025-12-04T12:42:03.7813273Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7813314Z method(*args, **kwargs) 2025-12-04T12:42:03.7813469Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7813518Z with policy(): 2025-12-04T12:42:03.7813672Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7813724Z raise RuntimeError(msg) 2025-12-04T12:42:03.7814156Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1105199104 and is now 2843738112. 2025-12-04T12:42:03.7814159Z 2025-12-04T12:42:03.7814238Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7814577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7814580Z 2025-12-04T12:42:03.7814672Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7814675Z 2025-12-04T12:42:03.7814677Z 2025-12-04T12:42:03.7814753Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7814845Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7815120Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ccab4a1f1ee4cc50.xml - 2025-12-04T12:42:03.7815182Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7815531Z FAILED [8.6133s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7815579Z Traceback (most recent call last): 2025-12-04T12:42:03.7815746Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7815790Z getattr(self, test_name)() 2025-12-04T12:42:03.7815950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7815986Z fn() 2025-12-04T12:42:03.7816138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7816178Z method(*args, **kwargs) 2025-12-04T12:42:03.7816329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7816396Z method(*args, **kwargs) 2025-12-04T12:42:03.7816548Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7816586Z with policy(): 2025-12-04T12:42:03.7816739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7816782Z raise RuntimeError(msg) 2025-12-04T12:42:03.7817214Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1105199104 and is now 2843738112. 2025-12-04T12:42:03.7817216Z 2025-12-04T12:42:03.7817294Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7817633Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7817645Z 2025-12-04T12:42:03.7817734Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7817807Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7817870Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.7817907Z Got exit code 1 2025-12-04T12:42:03.7818231Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7818360Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.7818591Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-822400ffddb1145d.xml 2025-12-04T12:42:03.7818651Z ============================= test session starts ============================== 2025-12-04T12:42:03.7818766Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7818810Z cachedir: .pytest_cache 2025-12-04T12:42:03.7818968Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7819017Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7819059Z configfile: pytest.ini 2025-12-04T12:42:03.7819225Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7819584Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7819638Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7819982Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7820043Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7820100Z collected 15 items / 2 deselected / 13 selected 2025-12-04T12:42:03.7820156Z stepcurrent: skipping 2 already run items. 
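Note on the RuntimeError above: the mem_leak_check harness records per-device memory before the wrapped test and compares it afterwards; the two pairs of numbers in the message are the caching-allocator bytes and the driver-reported bytes. A rough sketch of that comparison using only public torch.cuda calls (assumptions: a single visible device, and run_test_body is a hypothetical stand-in; this is not the actual check in common_utils.py, which is more careful and retries after collecting garbage and emptying the cache):

import gc
import torch

def snapshot(device=0):
    # Bytes currently held by the caching allocator on this device.
    allocator_bytes = torch.cuda.memory_allocated(device)
    # Bytes the driver reports as in use: total minus free.
    free, total = torch.cuda.mem_get_info(device)
    return allocator_bytes, total - free

before_alloc, before_driver = snapshot()
run_test_body()  # hypothetical placeholder for the test method under check
gc.collect()
torch.cuda.empty_cache()
after_alloc, after_driver = snapshot()

if after_alloc > before_alloc and after_driver > before_driver:
    raise RuntimeError(
        f"Caching allocator allocated memory was {before_alloc} and is now "
        f"reported as {after_alloc}. CUDA driver allocated memory was "
        f"{before_driver} and is now {after_driver}."
    )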
2025-12-04T12:42:03.7820199Z Running 13 items in this shard 2025-12-04T12:42:03.7820201Z 2025-12-04T12:42:03.7820635Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:34:39.393000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 464288 2025-12-04T12:42:03.7820794Z I1204 12:34:39.394000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 464289 2025-12-04T12:42:03.7820949Z I1204 12:34:39.395000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 464290 2025-12-04T12:42:03.7821101Z I1204 12:34:39.395000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 464291 2025-12-04T12:42:03.7821784Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7821846Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7822516Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7822572Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7823239Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7823282Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7823952Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7823995Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7825266Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7825413Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7826677Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7826803Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7828080Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7828251Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7829514Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7829635Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7829769Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7829925Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7830207Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7830379Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7830659Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7830777Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7831045Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7831190Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7831463Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7831618Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7831906Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7832035Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7832307Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7832449Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] raise 
RuntimeError(msg) 2025-12-04T12:42:03.7833002Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7833114Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7833302Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7833767Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7833876Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7834080Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7834237Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7834369Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7834522Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7834821Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7834969Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7835246Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7835362Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7835629Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7835772Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7836054Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7836205Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7836477Z E1204 
12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7836605Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7836876Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7837017Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7837567Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7837677Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7837867Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7838349Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7838458Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7838660Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7838818Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7838980Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7839133Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7839414Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7839561Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7839836Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7839952Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7840220Z E1204 12:34:46.797000 464290 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7840373Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7840653Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7840794Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7841066Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7841197Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7841470Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7841612Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7842160Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7842269Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7842462Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7842924Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7843032Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7843236Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7843412Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7843545Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7843698Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7843975Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7844120Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7844398Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7844514Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7844791Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7844944Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7845213Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7845355Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7845625Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7845752Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7846024Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7846164Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7846713Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.7846822Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7847013Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7847468Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7847574Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7847795Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7847953Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7847995Z FAILED [8.5138s] [ 7%] 2025-12-04T12:42:03.7847997Z 2025-12-04T12:42:03.7848054Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7848271Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7848318Z Traceback (most recent call last): 2025-12-04T12:42:03.7848485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7848530Z self._join_processes(fn) 2025-12-04T12:42:03.7848707Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7848781Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7848962Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7849020Z raise RuntimeError(error) 2025-12-04T12:42:03.7849102Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7849148Z Traceback (most recent call last): 2025-12-04T12:42:03.7849308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7849353Z getattr(self, test_name)() 2025-12-04T12:42:03.7849510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7849547Z 
fn() 2025-12-04T12:42:03.7849700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7849744Z method(*args, **kwargs) 2025-12-04T12:42:03.7849894Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7849937Z method(*args, **kwargs) 2025-12-04T12:42:03.7850086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7850126Z with policy(): 2025-12-04T12:42:03.7850277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7850320Z raise RuntimeError(msg) 2025-12-04T12:42:03.7850755Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7850758Z 2025-12-04T12:42:03.7850836Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7851208Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7851212Z 2025-12-04T12:42:03.7851305Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7851308Z 2025-12-04T12:42:03.7851310Z 2025-12-04T12:42:03.7851387Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7851475Z Process 3 terminated with exit code 10, terminating remaining processes. 
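Note on the FutureWarning from test_fsdp_dtensor_state_dict.py:240 repeated in this output: it asks callers to move from FSDP.set_state_dict_type() to the torch.distributed.checkpoint.state_dict APIs it links. A minimal sketch of that migration, assuming an already-initialized process group and an FSDP-wrapped model; model and optimizer are placeholder arguments and the sharded/offload options are illustrative, not taken from the test:

from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model, optimizer):
    # Sharded (non-full) state dict, optionally offloaded to CPU, replacing the
    # deprecated FSDP.set_state_dict_type(...) call flagged by the warning.
    options = StateDictOptions(full_state_dict=False, cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)

    # ... persist/restore model_sd and optim_sd (e.g. via torch.distributed.checkpoint) ...

    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )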
2025-12-04T12:42:03.7851777Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-822400ffddb1145d.xml - 2025-12-04T12:42:03.7851841Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7852189Z FAILED [8.5138s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7852236Z Traceback (most recent call last): 2025-12-04T12:42:03.7852402Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7852446Z getattr(self, test_name)() 2025-12-04T12:42:03.7852605Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7852644Z fn() 2025-12-04T12:42:03.7852795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7852848Z method(*args, **kwargs) 2025-12-04T12:42:03.7853010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7853051Z method(*args, **kwargs) 2025-12-04T12:42:03.7853200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7853238Z with policy(): 2025-12-04T12:42:03.7853389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7853431Z raise RuntimeError(msg) 2025-12-04T12:42:03.7853862Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7853865Z 2025-12-04T12:42:03.7853943Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7854280Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7854284Z 2025-12-04T12:42:03.7854372Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7854438Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7854501Z ======================= 1 failed, 2 deselected in 8.65s ======================== 2025-12-04T12:42:03.7854542Z Got exit code 1 2025-12-04T12:42:03.7854582Z Retrying single test... 
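Note on the repeated UserWarning from torch/autograd/graph.py:865 in this run: the warning text itself names an opt-out. A one-line sketch, only appropriate when the AccumulateGrad stream mismatch is intentional (otherwise the warning recommends dropping stale references to the autograd graph or initializing DDP under the same stream as subsequent forwards):

import torch

# Suppress the AccumulateGrad stream-mismatch warning, as suggested by the warning text.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)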
2025-12-04T12:42:03.7854811Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-62bd11c9460b4997.xml 2025-12-04T12:42:03.7854870Z ============================= test session starts ============================== 2025-12-04T12:42:03.7854984Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7855025Z cachedir: .pytest_cache 2025-12-04T12:42:03.7855189Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7855235Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7855275Z configfile: pytest.ini 2025-12-04T12:42:03.7855437Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7855815Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7855868Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7856213Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7856272Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7856329Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7856664Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7856708Z Running 1 items in this shard 2025-12-04T12:42:03.7856720Z 2025-12-04T12:42:03.7857127Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:34:50.303000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 464690 2025-12-04T12:42:03.7857293Z I1204 12:34:50.304000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 464691 2025-12-04T12:42:03.7857446Z I1204 12:34:50.305000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 464692 2025-12-04T12:42:03.7857596Z I1204 12:34:50.305000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 464693 2025-12-04T12:42:03.7858312Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7858359Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7859030Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7859075Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7859745Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7859789Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7860485Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7860528Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7861803Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7861945Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7863225Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7863350Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7864653Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7864777Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7866063Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7866184Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7866319Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7866475Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7866760Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7866929Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7867208Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7867427Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7867698Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7867841Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7868111Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7868303Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7868574Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7868703Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7868976Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7869118Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7869671Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.7869781Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7870003Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7870465Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7870574Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7870779Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7870936Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7871070Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7871236Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7871529Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7871675Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7871955Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7872074Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7872342Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7872486Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7872753Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7872895Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7873164Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7873293Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7873565Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7873706Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7874282Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7874390Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7874580Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7875039Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7875145Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7875351Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7875508Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7875650Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7875812Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7876092Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7876238Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7876517Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7876634Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7876905Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7877046Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7877313Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7877455Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7877722Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7877853Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7878124Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7878309Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7878882Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 2025-12-04T12:42:03.7878991Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7879184Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7879642Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7879752Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7879955Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7880145Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7880275Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7880428Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7880709Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7880856Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7881133Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7881249Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7881521Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7881661Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7881931Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7882071Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7882340Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7882469Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7882742Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7882916Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7883468Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7883578Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7883771Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7884228Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7884350Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7884569Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7884728Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7884773Z FAILED [8.6149s] [100%] 2025-12-04T12:42:03.7884776Z 2025-12-04T12:42:03.7884833Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7885021Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7885071Z Traceback (most recent call last): 2025-12-04T12:42:03.7885239Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7885285Z self._join_processes(fn) 2025-12-04T12:42:03.7885463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7885518Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7885698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7885742Z raise RuntimeError(error) 2025-12-04T12:42:03.7885825Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7885870Z Traceback (most recent call last): 2025-12-04T12:42:03.7886035Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7886080Z getattr(self, test_name)() 2025-12-04T12:42:03.7886241Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7886277Z fn() 2025-12-04T12:42:03.7886432Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7886477Z method(*args, **kwargs) 2025-12-04T12:42:03.7886628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7886673Z method(*args, **kwargs) 2025-12-04T12:42:03.7886822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7886862Z with policy(): 2025-12-04T12:42:03.7887035Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7887080Z raise RuntimeError(msg) 2025-12-04T12:42:03.7887513Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.7887516Z 2025-12-04T12:42:03.7887593Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7887931Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7887934Z 2025-12-04T12:42:03.7888027Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7888031Z 2025-12-04T12:42:03.7888092Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7888188Z Traceback (most recent call last): 2025-12-04T12:42:03.7888353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7888412Z getattr(self, test_name)() 2025-12-04T12:42:03.7888574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7888610Z fn() 2025-12-04T12:42:03.7888764Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7888804Z method(*args, **kwargs) 2025-12-04T12:42:03.7888963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7889002Z method(*args, **kwargs) 2025-12-04T12:42:03.7889157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7889195Z with policy(): 2025-12-04T12:42:03.7889351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7889393Z raise RuntimeError(msg) 2025-12-04T12:42:03.7889829Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7889832Z 2025-12-04T12:42:03.7889907Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7890249Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7890252Z 2025-12-04T12:42:03.7890342Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7890345Z 2025-12-04T12:42:03.7890347Z 2025-12-04T12:42:03.7890423Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7890512Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7890782Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-62bd11c9460b4997.xml - 2025-12-04T12:42:03.7890845Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7891218Z FAILED [8.6149s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7891268Z Traceback (most recent call last): 2025-12-04T12:42:03.7891432Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7891477Z getattr(self, test_name)() 2025-12-04T12:42:03.7891638Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7891673Z fn() 2025-12-04T12:42:03.7891825Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7891865Z method(*args, **kwargs) 2025-12-04T12:42:03.7892016Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7892057Z method(*args, **kwargs) 2025-12-04T12:42:03.7892210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7892259Z with policy(): 2025-12-04T12:42:03.7892423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7892464Z raise RuntimeError(msg) 2025-12-04T12:42:03.7892895Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.7892897Z 2025-12-04T12:42:03.7892971Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7893310Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7893313Z 2025-12-04T12:42:03.7893400Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7893402Z 2025-12-04T12:42:03.7893461Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7893508Z Traceback (most recent call last): 2025-12-04T12:42:03.7893671Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7893714Z getattr(self, test_name)() 2025-12-04T12:42:03.7893873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7893909Z fn() 2025-12-04T12:42:03.7894061Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7894104Z method(*args, **kwargs) 2025-12-04T12:42:03.7894254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7894296Z method(*args, **kwargs) 2025-12-04T12:42:03.7894445Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7894484Z with policy(): 2025-12-04T12:42:03.7894637Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7894682Z raise RuntimeError(msg) 2025-12-04T12:42:03.7895131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7895136Z 2025-12-04T12:42:03.7895210Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7895545Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7895548Z 2025-12-04T12:42:03.7895634Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7895700Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7895761Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.7895800Z Got exit code 1 2025-12-04T12:42:03.7895840Z Retrying single test... 
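Before the retry output below, note that the failure text above prints an exact repro command. A minimal sketch of driving it from Python, assuming the PyTorch repo root as the working directory; the environment variables and the test id are copied verbatim from the log, everything else is illustrative.

import os
import subprocess

env = dict(os.environ, PYTORCH_TEST_WITH_ROCM="1", PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
        "TestFSDPWithDeviceMeshAndDTensorCUDA."
        "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda",
    ],
    env=env,
    check=False,  # the test is expected to fail while the leak is present
)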
2025-12-04T12:42:03.7896070Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f354e4e5939e2ac6.xml 2025-12-04T12:42:03.7896138Z ============================= test session starts ============================== 2025-12-04T12:42:03.7896270Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7896312Z cachedir: .pytest_cache 2025-12-04T12:42:03.7896472Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7896519Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7896562Z configfile: pytest.ini 2025-12-04T12:42:03.7896726Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7897087Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7897141Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7897484Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7897545Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7897603Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7897932Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7897976Z Running 1 items in this shard 2025-12-04T12:42:03.7897978Z 2025-12-04T12:42:03.7898426Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:35:01.468000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 465092 2025-12-04T12:42:03.7898585Z I1204 12:35:01.468000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 465093 2025-12-04T12:42:03.7898741Z I1204 12:35:01.469000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 465094 2025-12-04T12:42:03.7898893Z I1204 12:35:01.470000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 465095 2025-12-04T12:42:03.7899599Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7899647Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7900315Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7900359Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7901033Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7901100Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7901771Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7901812Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7903091Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7903222Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7904506Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7904631Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7905898Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7906039Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7907360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7907483Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7907619Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7907775Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7908060Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7908259Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7908539Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7908656Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7908925Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7909095Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7909369Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7909511Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7909782Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7909910Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7910183Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7910335Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7910904Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1105199104 and is now 2820669440. 
2025-12-04T12:42:03.7911016Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7911209Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7911668Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7911778Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7911983Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7912141Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7912273Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7912427Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7912709Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7912858Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7913134Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7913250Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7913541Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7913686Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7913953Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7914095Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7914393Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7914524Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7914812Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7914963Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7915515Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7915626Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7915819Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7916278Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7916386Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7916591Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7916750Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7916882Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7917034Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7917314Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7917464Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7917792Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7917910Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7918216Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7918360Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7918629Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7918771Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7919042Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7919183Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7919467Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7919608Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7920164Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.7920273Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7920466Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7920922Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7921029Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7921234Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7921392Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7921525Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7921677Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7921959Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7922109Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7922413Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7922531Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7922799Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7922940Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7923207Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7923350Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7923629Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7923769Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7924041Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7924182Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7924735Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7924844Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7925034Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7925532Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7925640Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7925845Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7926004Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7926048Z FAILED [11.2178s] [100%] 2025-12-04T12:42:03.7926050Z 2025-12-04T12:42:03.7926108Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7926293Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7926340Z Traceback (most recent call last): 2025-12-04T12:42:03.7926526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7926572Z self._join_processes(fn) 2025-12-04T12:42:03.7926748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7926803Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7926983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7927028Z raise RuntimeError(error) 2025-12-04T12:42:03.7927111Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7927157Z Traceback (most recent call last): 2025-12-04T12:42:03.7927318Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7927361Z getattr(self, test_name)() 2025-12-04T12:42:03.7927522Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7927572Z fn() 2025-12-04T12:42:03.7927724Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7927777Z method(*args, **kwargs) 2025-12-04T12:42:03.7927928Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7927970Z method(*args, **kwargs) 2025-12-04T12:42:03.7928121Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7928196Z with policy(): 2025-12-04T12:42:03.7928348Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7928392Z raise RuntimeError(msg) 2025-12-04T12:42:03.7928828Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1105199104 and is now 2820669440. 2025-12-04T12:42:03.7928832Z 2025-12-04T12:42:03.7928910Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7929249Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7929253Z 2025-12-04T12:42:03.7929341Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7929343Z 2025-12-04T12:42:03.7929345Z 2025-12-04T12:42:03.7929424Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7929512Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7929787Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f354e4e5939e2ac6.xml - 2025-12-04T12:42:03.7929848Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7930198Z FAILED [11.2178s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7930244Z Traceback (most recent call last): 2025-12-04T12:42:03.7930411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7930493Z getattr(self, test_name)() 2025-12-04T12:42:03.7930657Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7930694Z fn() 2025-12-04T12:42:03.7930846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7930888Z method(*args, **kwargs) 2025-12-04T12:42:03.7931039Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7931079Z method(*args, **kwargs) 2025-12-04T12:42:03.7931229Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7931269Z with policy(): 2025-12-04T12:42:03.7931421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7931466Z raise RuntimeError(msg) 2025-12-04T12:42:03.7931898Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1105199104 and is now 2820669440. 2025-12-04T12:42:03.7931924Z 2025-12-04T12:42:03.7932001Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7932337Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7932342Z 2025-12-04T12:42:03.7932429Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7932496Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7932559Z ====================== 1 failed, 14 deselected in 11.35s ======================= 2025-12-04T12:42:03.7932599Z Got exit code 1 2025-12-04T12:42:03.7932882Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7933013Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.7933242Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-6cfddfd95d5027e8.xml 2025-12-04T12:42:03.7933302Z ============================= test session starts ============================== 2025-12-04T12:42:03.7933415Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7933460Z cachedir: .pytest_cache 2025-12-04T12:42:03.7933618Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7933667Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7933710Z configfile: pytest.ini 2025-12-04T12:42:03.7933875Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7934234Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7934289Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7934657Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7934716Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7934773Z collected 15 items / 3 deselected / 12 selected 2025-12-04T12:42:03.7934828Z stepcurrent: skipping 3 already run items. 
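The FutureWarning repeated throughout these runs says FSDP.state_dict_type()/FSDP.set_state_dict_type() are being deprecated in favor of torch.distributed.checkpoint.state_dict. A minimal sketch of that replacement API follows; it assumes an already initialized process group, an FSDP-wrapped module, and its optimizer (none of which this fragment sets up), and the function name is illustrative rather than taken from the test file.

# Sketch of the API the FutureWarning points to; `model` and `optimizer` are assumed to be
# an FSDP-wrapped module and its optimizer inside an initialized distributed job.
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def roundtrip_state_dict(model: torch.nn.Module, optimizer: torch.optim.Optimizer) -> None:
    # Gather sharded (DTensor-backed) state dicts, offloaded to CPU; cpu_offload mirrors the
    # offload_to_cpu_True knob the failing test parameterizes over.
    options = StateDictOptions(cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)

    # Load them back without calling FSDP.set_state_dict_type().
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )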
2025-12-04T12:42:03.7934872Z Running 12 items in this shard 2025-12-04T12:42:03.7934874Z 2025-12-04T12:42:03.7935282Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:35:15.336000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 465494 2025-12-04T12:42:03.7935440Z I1204 12:35:15.337000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 465495 2025-12-04T12:42:03.7935593Z I1204 12:35:15.337000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 465496 2025-12-04T12:42:03.7935745Z I1204 12:35:15.338000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 465497 2025-12-04T12:42:03.7936439Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7936494Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7937165Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7937209Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7937880Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7937923Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7938626Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7938671Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7939970Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7940097Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7941364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7941516Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7942787Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7942909Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7944173Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7944318Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7944452Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7944611Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7944896Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7945045Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7945327Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7945445Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7945728Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7945881Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7946152Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7946292Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7946564Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7946697Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7946968Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7947110Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] raise 
RuntimeError(msg) 2025-12-04T12:42:03.7947667Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2820669440. 2025-12-04T12:42:03.7947780Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7947971Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7948463Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7955030Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7955298Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7955461Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7955594Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7955747Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7956030Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7956178Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7956465Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7956597Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7956883Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7957023Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7957291Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7957433Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7957702Z E1204 
12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7957830Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7958103Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7958295Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7958852Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7958965Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7959156Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7959642Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7959753Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7959957Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7960120Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7960251Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7960405Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7960688Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7960837Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7961127Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7961256Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7961526Z E1204 12:35:22.852000 465494 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7961665Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7961936Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7962076Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7962345Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7962472Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7962744Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7962886Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7963437Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
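[editor's note] Every "CUDA driver API confirmed a leak" failure above comes from the same per-test check: the harness snapshots per-device allocator usage before the test body and re-reads it afterwards (here device 0 went from 0 to 7680 bytes in the caching allocator). Below is a loose approximation of that pattern for orientation only; it is not PyTorch's actual CudaMemoryLeakCheck implementation and simplifies away the driver-level accounting:

# Illustrative approximation only -- a stripped-down stand-in for the check the
# harness wraps around each test when PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 is set.
import gc
from contextlib import contextmanager

import torch


@contextmanager
def cuda_leak_check():
    # snapshot per-device usage of the caching allocator before the test body
    gc.collect()
    for d in range(torch.cuda.device_count()):
        torch.cuda.synchronize(d)
    before = [torch.cuda.memory_allocated(d) for d in range(torch.cuda.device_count())]

    yield  # test body runs here; the check below only runs if it succeeds

    # re-read after the body; any growth is reported the way the log does above
    gc.collect()
    for d in range(torch.cuda.device_count()):
        torch.cuda.synchronize(d)
    for d, old in enumerate(before):
        now = torch.cuda.memory_allocated(d)
        if now > old:
            raise RuntimeError(
                f"Caching allocator allocated memory was {old} and is now "
                f"reported as {now} on device {d}"
            )
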
2025-12-04T12:42:03.7963545Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7963733Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7964209Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7964319Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7964523Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7964682Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7964851Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7965009Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7965289Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7965457Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7965733Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7965848Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7966119Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7966259Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7966530Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7966670Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7966940Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7967067Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7967339Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7967479Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7968027Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7968135Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7968396Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7968853Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7968960Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7969164Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7969322Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7969363Z FAILED [8.6153s] [ 8%] 2025-12-04T12:42:03.7969368Z 2025-12-04T12:42:03.7969427Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7969628Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7969690Z Traceback (most recent call last): 2025-12-04T12:42:03.7969855Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7969901Z self._join_processes(fn) 2025-12-04T12:42:03.7970074Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7970129Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7970308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7970355Z raise RuntimeError(error) 2025-12-04T12:42:03.7970436Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7970484Z Traceback (most recent call last): 2025-12-04T12:42:03.7970645Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7970689Z getattr(self, test_name)() 2025-12-04T12:42:03.7970846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7970881Z 
fn() 2025-12-04T12:42:03.7971032Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7971074Z method(*args, **kwargs) 2025-12-04T12:42:03.7971224Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7971264Z method(*args, **kwargs) 2025-12-04T12:42:03.7971415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7971455Z with policy(): 2025-12-04T12:42:03.7971607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7971648Z raise RuntimeError(msg) 2025-12-04T12:42:03.7972079Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2820669440. 2025-12-04T12:42:03.7972082Z 2025-12-04T12:42:03.7972157Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7972520Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7972524Z 2025-12-04T12:42:03.7972614Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7972617Z 2025-12-04T12:42:03.7972620Z 2025-12-04T12:42:03.7972698Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7972787Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.7973061Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-6cfddfd95d5027e8.xml - 2025-12-04T12:42:03.7973122Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7973471Z FAILED [8.6153s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7973529Z Traceback (most recent call last): 2025-12-04T12:42:03.7973704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7973747Z getattr(self, test_name)() 2025-12-04T12:42:03.7973907Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7973943Z fn() 2025-12-04T12:42:03.7974095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7974134Z method(*args, **kwargs) 2025-12-04T12:42:03.7974285Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7974327Z method(*args, **kwargs) 2025-12-04T12:42:03.7974478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7974516Z with policy(): 2025-12-04T12:42:03.7974669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7974710Z raise RuntimeError(msg) 2025-12-04T12:42:03.7975141Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2820669440. 2025-12-04T12:42:03.7975143Z 2025-12-04T12:42:03.7975217Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7975558Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7975561Z 2025-12-04T12:42:03.7975649Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7975714Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7975777Z ======================= 1 failed, 3 deselected in 8.75s ======================== 2025-12-04T12:42:03.7975815Z Got exit code 1 2025-12-04T12:42:03.7975855Z Retrying single test... 
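[editor's note] The "To execute this test, run the following from the base repo dir" block above is a complete local repro. The wrapper below is a convenience sketch only: the command and environment variables are copied verbatim from that message, while the wrapper itself and the placeholder checkout path are editorial additions:

# Convenience wrapper only -- the command and env vars are taken from the repro
# message printed above; everything else here is illustrative.
import os
import subprocess

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",            # run the test against the ROCm/HIP build
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",   # enable the per-test allocator leak check
    # PYTORCH_PRINT_REPRO_ON_FAILURE="0",   # per the log, silences the repro banner
)

subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
        "TestFSDPWithDeviceMeshAndDTensorCUDA."
        "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda",
    ],
    cwd="/path/to/pytorch",  # placeholder for the "base repo dir" the message refers to
    env=env,
    check=True,
)
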
2025-12-04T12:42:03.7976084Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f7d3e22584c96aa7.xml 2025-12-04T12:42:03.7976141Z ============================= test session starts ============================== 2025-12-04T12:42:03.7976280Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7976321Z cachedir: .pytest_cache 2025-12-04T12:42:03.7976481Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7976529Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7976569Z configfile: pytest.ini 2025-12-04T12:42:03.7976734Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7977095Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7977148Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7977493Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7977563Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7977631Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7977960Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7978004Z Running 1 items in this shard 2025-12-04T12:42:03.7978006Z 2025-12-04T12:42:03.7978522Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:35:26.592000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 465896 2025-12-04T12:42:03.7978678Z I1204 12:35:26.593000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 465897 2025-12-04T12:42:03.7978830Z I1204 12:35:26.593000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 465898 2025-12-04T12:42:03.7978982Z I1204 12:35:26.594000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 465899 2025-12-04T12:42:03.7979672Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7979718Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7980393Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7980437Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7981136Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7981179Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7981852Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7981895Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7983172Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7983326Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7984598Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7984725Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7986016Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7986139Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7987402Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
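[editor's note] The UserWarning ending just above both explains the cause (an AccumulateGrad node kept alive across iterations, e.g. by a retained loss or a DDP-held reference) and names an explicit opt-out. A minimal sketch of that opt-out, with the call quoted from the warning itself; it is only appropriate when the stream mismatch is known to be intentional:

# Quoted from the UserWarning above. The preferred fix is to drop stale
# references to the autograd graph (e.g. the loss) or to run DDP initialization
# on the same stream as subsequent forwards; this call merely silences the warning.
import torch

torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)
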
2025-12-04T12:42:03.7987533Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7987677Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7987833Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7988117Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7988300Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7988581Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7988700Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7988970Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7989111Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7989383Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7989523Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7989793Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7989923Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7990194Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7990333Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7990918Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
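[editor's note] The FutureWarning printed at the start of this retry session (FSDP.state_dict_type() / FSDP.set_state_dict_type() being deprecated) points at torch.distributed.checkpoint.state_dict as the replacement. A hedged sketch of the API the warning names, with model and optimizer standing in for whatever the caller already has wrapped in FSDP:

# Illustrative sketch only -- not the test's code. It shows the API the
# FutureWarning above recommends in place of FSDP.set_state_dict_type().
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict


def checkpoint_roundtrip(model, optimizer):
    # get_state_dict() returns (model_state_dict, optim_state_dict) and, per the
    # warning's doc link, is meant to work across FSDP1, FSDP2 and DDP wrappers.
    model_sd, optim_sd = get_state_dict(model, optimizer)

    # ... save / reload model_sd and optim_sd, e.g. via torch.distributed.checkpoint ...

    # set_state_dict() is the matching load-side call.
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )
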
2025-12-04T12:42:03.7991031Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7991221Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7991679Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7991799Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7992005Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7992177Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7992308Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7992462Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7992743Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7992892Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7993169Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7993286Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7993554Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7993696Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7993965Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7994105Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7994374Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7994501Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7994795Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7994935Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7995484Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7995594Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7995783Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7996241Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7996366Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7996568Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7996726Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7996858Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7997013Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7997293Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7997440Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7997715Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7997830Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7998099Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7998291Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7998561Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7998700Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7998968Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7999124Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7999397Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7999540Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8000087Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8000195Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8000384Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8000861Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8000980Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8001182Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8001340Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8001470Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8001623Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8001904Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8002052Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8002329Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8002445Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8002713Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8002855Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8003124Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8003264Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8003571Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8003699Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8003971Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8004110Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8004660Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8004779Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8004968Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8005435Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8005543Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8005749Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8005905Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8005947Z FAILED [8.7142s] [100%] 2025-12-04T12:42:03.8005950Z 2025-12-04T12:42:03.8006006Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8006188Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8006235Z Traceback (most recent call last): 2025-12-04T12:42:03.8006398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8006443Z self._join_processes(fn) 2025-12-04T12:42:03.8006616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8006671Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8006849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8006894Z raise RuntimeError(error) 2025-12-04T12:42:03.8006975Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8007021Z Traceback (most recent call last): 2025-12-04T12:42:03.8007181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8007223Z getattr(self, test_name)() 2025-12-04T12:42:03.8007381Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8007416Z fn() 2025-12-04T12:42:03.8007587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8007628Z method(*args, **kwargs) 2025-12-04T12:42:03.8007780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8007819Z method(*args, **kwargs) 2025-12-04T12:42:03.8007969Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8008006Z with policy(): 2025-12-04T12:42:03.8008190Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8008233Z raise RuntimeError(msg) 2025-12-04T12:42:03.8008667Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8008671Z 2025-12-04T12:42:03.8008762Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8009098Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8009113Z 2025-12-04T12:42:03.8009202Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8009204Z 2025-12-04T12:42:03.8009206Z 2025-12-04T12:42:03.8009281Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8009369Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8009644Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f7d3e22584c96aa7.xml - 2025-12-04T12:42:03.8009705Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8010053Z FAILED [8.7142s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8010101Z Traceback (most recent call last): 2025-12-04T12:42:03.8010265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8010308Z getattr(self, test_name)() 2025-12-04T12:42:03.8010468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8010502Z fn() 2025-12-04T12:42:03.8010655Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8010696Z method(*args, **kwargs) 2025-12-04T12:42:03.8010848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8010888Z method(*args, **kwargs) 2025-12-04T12:42:03.8011037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8011074Z with policy(): 2025-12-04T12:42:03.8011225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8011265Z raise RuntimeError(msg) 2025-12-04T12:42:03.8011721Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8011725Z 2025-12-04T12:42:03.8011800Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8012137Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8012140Z 2025-12-04T12:42:03.8012226Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8012289Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8012350Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8012388Z Got exit code 1 2025-12-04T12:42:03.8012427Z Retrying single test... 2025-12-04T12:42:03.8012658Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-b0ce32d3573b3332.xml 2025-12-04T12:42:03.8012728Z ============================= test session starts ============================== 2025-12-04T12:42:03.8012852Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8012893Z cachedir: .pytest_cache 2025-12-04T12:42:03.8013053Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8013098Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8013138Z configfile: pytest.ini 2025-12-04T12:42:03.8013302Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8013663Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8013716Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8014059Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8014119Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8014176Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8014503Z stepcurrent: skipping 3 already run items. 
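The two PytestCollectionWarning lines above are benign: pytest attempts to collect any class whose name starts with Test, and skips these nn.Module helpers because they define __init__. If the noise is unwanted, marking such helper classes as non-tests silences it. A minimal sketch, with an illustrative body rather than the real TestDummyModel from test_fsdp_dtensor_state_dict.py:

    import torch

    class TestDummyModel(torch.nn.Module):
        __test__ = False  # tell pytest not to collect this nn.Module as a test class

        def __init__(self) -> None:
            super().__init__()
            self.net = torch.nn.Linear(8, 8)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.net(x)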
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8014546Z Running 1 items in this shard 2025-12-04T12:42:03.8014550Z 2025-12-04T12:42:03.8014960Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:35:37.708000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 466298 2025-12-04T12:42:03.8015118Z I1204 12:35:37.709000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 466299 2025-12-04T12:42:03.8015270Z I1204 12:35:37.710000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 466300 2025-12-04T12:42:03.8015458Z I1204 12:35:37.710000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 466301 2025-12-04T12:42:03.8016159Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8016205Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8016877Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8016921Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8017690Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8017760Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8018471Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8018511Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8019797Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.8019925Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8021228Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.8021353Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8022617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
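The UserWarning above (repeated once per rank) names its own escape hatch: when the AccumulateGrad stream mismatch is known to be intentional, PyTorch can be told not to warn. A minimal sketch using only the call spelled out in the warning text:

    import torch

    # Only appropriate if the stream mismatch described in the warning above is
    # intentional; otherwise keep the warning and fix the stream usage instead.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)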
2025-12-04T12:42:03.8022763Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8024034Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.8024159Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8024292Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8024447Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8024732Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8024882Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8025161Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8025278Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8025550Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8025722Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8025993Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8026134Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8026402Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8026531Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:42:03.8026804Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8026956Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8027520Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8027632Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8027824Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8028315Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8028426Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8028632Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8028789Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8028921Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8029075Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8029355Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8029503Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8029779Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8029895Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8030191Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8030335Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8030604Z E1204 12:35:44.978000 
466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8032990Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8033272Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8033404Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8033699Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8033857Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8034424Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2820669440. 2025-12-04T12:42:03.8034534Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8034727Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8035185Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8035294Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8035576Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8035735Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8035869Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8036021Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8036304Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8036451Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8036742Z E1204 12:35:44.994000 466299 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8036860Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8037132Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8037274Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8037621Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8037763Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8038031Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8038203Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8038493Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8038635Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8039187Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8039295Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8039488Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8039948Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8040057Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8040261Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8040419Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8040550Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8040702Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8040982Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8041127Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8041418Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8041534Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8041803Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8041972Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8042240Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8042381Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8042662Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8042801Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8043072Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8043213Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8043762Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8043871Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8044060Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8044519Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8044627Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8044828Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8044986Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8045028Z FAILED [8.4138s] [100%] 2025-12-04T12:42:03.8045030Z 2025-12-04T12:42:03.8045088Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8045273Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8045319Z Traceback (most recent call last): 2025-12-04T12:42:03.8045493Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8045538Z self._join_processes(fn) 2025-12-04T12:42:03.8045713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8045768Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8045948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8045991Z raise RuntimeError(error) 2025-12-04T12:42:03.8046072Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8046130Z Traceback (most recent call last): 2025-12-04T12:42:03.8046292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8046334Z getattr(self, test_name)() 2025-12-04T12:42:03.8046494Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8046538Z 
fn() 2025-12-04T12:42:03.8046693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8046748Z method(*args, **kwargs) 2025-12-04T12:42:03.8046901Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8046940Z method(*args, **kwargs) 2025-12-04T12:42:03.8047093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8047132Z with policy(): 2025-12-04T12:42:03.8047283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8047326Z raise RuntimeError(msg) 2025-12-04T12:42:03.8047758Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8047762Z 2025-12-04T12:42:03.8047838Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8048213Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8048215Z 2025-12-04T12:42:03.8048306Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8048308Z 2025-12-04T12:42:03.8048310Z 2025-12-04T12:42:03.8048388Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8048476Z Process 0 terminated with exit code 10, terminating remaining processes. 
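Exit code 10 is the per-rank failure code surfaced by the multiprocess harness: each rank runs the test body in its own process, and the parent (_join_processes and _check_return_codes in the traceback above) joins the workers and re-raises if any rank exited non-zero. A stripped-down sketch of that join-and-check pattern; it is not the common_distributed.py implementation, and run_on_rank is an illustrative stand-in for the real test body:

    import multiprocessing as mp
    import sys

    def run_on_rank(rank: int) -> None:
        # Illustrative stand-in: a real rank would set up a process group, run
        # the test method, and exit with code 10 if its leak check failed.
        leaked = rank == 0
        sys.exit(10 if leaked else 0)

    def join_and_check(world_size: int = 4) -> None:
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=run_on_rank, args=(r,)) for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        join_and_check()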
2025-12-04T12:42:03.8048751Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-b0ce32d3573b3332.xml - 2025-12-04T12:42:03.8048811Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8049161Z FAILED [8.4138s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8049207Z Traceback (most recent call last): 2025-12-04T12:42:03.8049374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8049431Z getattr(self, test_name)() 2025-12-04T12:42:03.8049593Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8049630Z fn() 2025-12-04T12:42:03.8049783Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8049823Z method(*args, **kwargs) 2025-12-04T12:42:03.8049975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8050013Z method(*args, **kwargs) 2025-12-04T12:42:03.8050178Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8050215Z with policy(): 2025-12-04T12:42:03.8050368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8050410Z raise RuntimeError(msg) 2025-12-04T12:42:03.8050840Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8050870Z 2025-12-04T12:42:03.8050945Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8051283Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8051286Z 2025-12-04T12:42:03.8051375Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8051439Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
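The repro block above can be replayed outside CI with the same two environment toggles. The same shell one-liner, expressed as a Python subprocess call run from the base repo dir; PYTORCH_PRINT_REPRO_ON_FAILURE=0 is the optional switch the message mentions for silencing the banner:

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # Optionally silence the repro banner on failure:
        # PYTORCH_PRINT_REPRO_ON_FAILURE="0",
    )
    # check=True raises CalledProcessError if the test (and its leak check) fails.
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
            "TestFSDPWithDeviceMeshAndDTensorCUDA."
            "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda",
        ],
        env=env,
        check=True,
    )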
2025-12-04T12:42:03.8051503Z ======================= 1 failed, 14 deselected in 8.55s ======================= 2025-12-04T12:42:03.8051541Z Got exit code 1 2025-12-04T12:42:03.8051825Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8051956Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8052185Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61ba864e2f05046f.xml 2025-12-04T12:42:03.8052243Z ============================= test session starts ============================== 2025-12-04T12:42:03.8052355Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8052398Z cachedir: .pytest_cache 2025-12-04T12:42:03.8052556Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8052604Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8052646Z configfile: pytest.ini 2025-12-04T12:42:03.8052809Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8053171Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8053223Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8053586Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8053646Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8053702Z collected 15 items / 4 deselected / 11 selected 2025-12-04T12:42:03.8053758Z stepcurrent: skipping 4 already run items. 2025-12-04T12:42:03.8053800Z Running 11 items in this shard 2025-12-04T12:42:03.8053804Z 2025-12-04T12:42:03.8054228Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:35:48.791000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 466700 2025-12-04T12:42:03.8054385Z I1204 12:35:48.792000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 466701 2025-12-04T12:42:03.8054538Z I1204 12:35:48.792000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 466702 2025-12-04T12:42:03.8054688Z I1204 12:35:48.793000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 466703 2025-12-04T12:42:03.8055384Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8055439Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8056112Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8056156Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8056825Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8056866Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8057540Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8057584Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8058086Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8058136Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8058680Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8058729Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8059231Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8059277Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8059766Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8059836Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8059973Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8060130Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8060414Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8060563Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8060843Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8060963Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8061235Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8061380Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8061648Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8061789Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8062061Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8062191Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8062465Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8062616Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8063174Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8063285Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8063491Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8063956Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8064086Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8064301Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8064459Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8064591Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8064742Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8065023Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8065169Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8065447Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8065564Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8065832Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8066019Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8066292Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8066434Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8066702Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8066831Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:42:03.8067111Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8067253Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8067818Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8067925Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8068116Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8068617Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8068757Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8068963Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8069122Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8069254Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8069406Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8069686Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8069834Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8070113Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8070231Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8070498Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8070640Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8070908Z E1204 
12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8071049Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8071314Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8071457Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8071731Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8071873Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8072439Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 2025-12-04T12:42:03.8072547Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8072747Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8073213Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8073321Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8073523Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8073680Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8073814Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8073966Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8074247Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8074394Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8074671Z E1204 12:35:56.302000 466702 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8074786Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8075056Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8075198Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8075465Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8075606Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8075882Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8076011Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8076282Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8076435Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8076988Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8077104Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8077306Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8077763Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8077871Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8078074Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8078278Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8078319Z FAILED [8.8150s] [ 9%] 2025-12-04T12:42:03.8078322Z 2025-12-04T12:42:03.8078378Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8078565Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8078613Z Traceback (most recent call last): 2025-12-04T12:42:03.8078778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8078821Z self._join_processes(fn) 2025-12-04T12:42:03.8078995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8079049Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8079227Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8079272Z raise RuntimeError(error) 2025-12-04T12:42:03.8079353Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8079398Z Traceback (most recent call last): 2025-12-04T12:42:03.8079562Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8079604Z getattr(self, test_name)() 2025-12-04T12:42:03.8079763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8079797Z fn() 2025-12-04T12:42:03.8079963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8080004Z method(*args, **kwargs) 2025-12-04T12:42:03.8080156Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8080197Z method(*args, **kwargs) 2025-12-04T12:42:03.8080347Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8080385Z with policy(): 2025-12-04T12:42:03.8080552Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8080596Z raise RuntimeError(msg) 2025-12-04T12:42:03.8081030Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8081047Z 2025-12-04T12:42:03.8081124Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8081476Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8081479Z 2025-12-04T12:42:03.8081567Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8081570Z 2025-12-04T12:42:03.8081572Z 2025-12-04T12:42:03.8081648Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8081736Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8082010Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61ba864e2f05046f.xml - 2025-12-04T12:42:03.8082071Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8082422Z FAILED [8.8150s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8082468Z Traceback (most recent call last): 2025-12-04T12:42:03.8082636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8082678Z getattr(self, test_name)() 2025-12-04T12:42:03.8082840Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8082873Z fn() 2025-12-04T12:42:03.8083026Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8083067Z method(*args, **kwargs) 2025-12-04T12:42:03.8083220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8083259Z method(*args, **kwargs) 2025-12-04T12:42:03.8083410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8083449Z with policy(): 2025-12-04T12:42:03.8083602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8083643Z raise RuntimeError(msg) 2025-12-04T12:42:03.8084086Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8084089Z 2025-12-04T12:42:03.8084165Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8084504Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8084507Z 2025-12-04T12:42:03.8084607Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8084671Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8084734Z ======================= 1 failed, 4 deselected in 8.98s ======================== 2025-12-04T12:42:03.8084771Z Got exit code 1 2025-12-04T12:42:03.8084811Z Retrying single test... 2025-12-04T12:42:03.8085054Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8fc19adac4e61b04.xml 2025-12-04T12:42:03.8085126Z ============================= test session starts ============================== 2025-12-04T12:42:03.8085240Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8085281Z cachedir: .pytest_cache 2025-12-04T12:42:03.8085439Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8085484Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8085525Z configfile: pytest.ini 2025-12-04T12:42:03.8085687Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8086045Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8086097Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8086440Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8086498Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8086557Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8086889Z stepcurrent: skipping 4 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8086935Z Running 1 items in this shard 2025-12-04T12:42:03.8086939Z 2025-12-04T12:42:03.8087347Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:36:00.205000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 467102 2025-12-04T12:42:03.8087503Z I1204 12:36:00.205000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 467103 2025-12-04T12:42:03.8087656Z I1204 12:36:00.206000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 467104 2025-12-04T12:42:03.8087805Z I1204 12:36:00.207000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 467105 2025-12-04T12:42:03.8088530Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8088576Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8089260Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8089305Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8089974Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8090044Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8090712Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8090753Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8091252Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8091302Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8091794Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8091843Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8092330Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8092378Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8092882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8092931Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8093067Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8093226Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8093509Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8095085Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8095368Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8095497Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8095769Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8095920Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8096192Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8096330Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8096601Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8096732Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8097002Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8097144Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8097701Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 958398464 and is now 2843738112. 
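Separately from the leak itself, each rank also emits the FutureWarning above about FSDP.state_dict_type()/FSDP.set_state_dict_type() being deprecated in favor of get_state_dict()/set_state_dict() from torch.distributed.checkpoint.state_dict. A hedged sketch of that migration follows; the model and optimizer are placeholders, and the exact keyword names should be confirmed against the API doc linked in the warning.

# Hedged sketch of the migration suggested by the FutureWarning above. The model and
# optimizer are placeholders; consult the linked API doc for the authoritative
# signatures of get_state_dict()/set_state_dict().
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict


def checkpoint_roundtrip(model: FSDP, optim: torch.optim.Optimizer):
    # Instead of wrapping calls in FSDP.set_state_dict_type(...), ask the checkpoint
    # helpers for the (sharded) model and optimizer state dicts directly.
    model_sd, optim_sd = get_state_dict(model, optim)

    # ... save model_sd / optim_sd with torch.distributed.checkpoint ...

    # Restore them later through the matching setter.
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)
    return model_sd, optim_sd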
2025-12-04T12:42:03.8097812Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8098003Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8098501Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8098624Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8098828Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8098989Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8099121Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8099274Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8099567Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8099715Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8099993Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8100138Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8100407Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8100548Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8100818Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8100958Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8101229Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8101360Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8101630Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8101772Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8102323Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8102434Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8102624Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8103090Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8103199Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8103400Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8103559Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8103706Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8103861Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8104140Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8104298Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8104584Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8104698Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8104966Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8105107Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8105375Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8105515Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8105783Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8105911Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8106184Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8106325Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8106876Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8106985Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8107173Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8107640Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8107749Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8107949Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8108122Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8108284Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8108438Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8108731Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8108890Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8109167Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8109282Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8109551Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8109691Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8109961Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8110098Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8110367Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8110495Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8110766Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8110910Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8111460Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8111581Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8111770Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8112228Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8112336Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8112549Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8112709Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8112748Z FAILED [8.9157s] [100%] 2025-12-04T12:42:03.8112760Z 2025-12-04T12:42:03.8112816Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8113000Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8113057Z Traceback (most recent call last): 2025-12-04T12:42:03.8113220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8113265Z self._join_processes(fn) 2025-12-04T12:42:03.8113438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8113491Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8113670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8113715Z raise RuntimeError(error) 2025-12-04T12:42:03.8113794Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8113840Z Traceback (most recent call last): 2025-12-04T12:42:03.8114002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8114045Z getattr(self, test_name)() 2025-12-04T12:42:03.8114203Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8114238Z fn() 2025-12-04T12:42:03.8114390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8114430Z method(*args, **kwargs) 2025-12-04T12:42:03.8114582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8114622Z method(*args, **kwargs) 2025-12-04T12:42:03.8114774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8114812Z with policy(): 2025-12-04T12:42:03.8114966Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8115007Z raise RuntimeError(msg) 2025-12-04T12:42:03.8115447Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8115450Z 2025-12-04T12:42:03.8115524Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8115877Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8115881Z 2025-12-04T12:42:03.8115969Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8115972Z 2025-12-04T12:42:03.8116032Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8116078Z Traceback (most recent call last): 2025-12-04T12:42:03.8116249Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8116293Z getattr(self, test_name)() 2025-12-04T12:42:03.8116473Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8116520Z fn() 2025-12-04T12:42:03.8116673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8116726Z method(*args, **kwargs) 2025-12-04T12:42:03.8116875Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8116925Z method(*args, **kwargs) 2025-12-04T12:42:03.8117074Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8117112Z with policy(): 2025-12-04T12:42:03.8117263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8117305Z raise RuntimeError(msg) 2025-12-04T12:42:03.8117736Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 958398464 and is now 2843738112. 
2025-12-04T12:42:03.8117739Z 2025-12-04T12:42:03.8117814Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8118192Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8118195Z 2025-12-04T12:42:03.8118281Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8118285Z 2025-12-04T12:42:03.8118287Z 2025-12-04T12:42:03.8118363Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8118450Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8118725Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8fc19adac4e61b04.xml - 2025-12-04T12:42:03.8118786Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8119138Z FAILED [8.9157s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8119183Z Traceback (most recent call last): 2025-12-04T12:42:03.8119349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8119391Z getattr(self, test_name)() 2025-12-04T12:42:03.8119568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8119604Z fn() 2025-12-04T12:42:03.8119754Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8119797Z method(*args, **kwargs) 2025-12-04T12:42:03.8119948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8119987Z method(*args, **kwargs) 2025-12-04T12:42:03.8120136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8120174Z with policy(): 2025-12-04T12:42:03.8120338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8120380Z raise RuntimeError(msg) 2025-12-04T12:42:03.8120814Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.8120828Z 2025-12-04T12:42:03.8120915Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8121254Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8121258Z 2025-12-04T12:42:03.8121346Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8121348Z 2025-12-04T12:42:03.8121408Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8121453Z Traceback (most recent call last): 2025-12-04T12:42:03.8121618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8121659Z getattr(self, test_name)() 2025-12-04T12:42:03.8121821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8121856Z fn() 2025-12-04T12:42:03.8122010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8122049Z method(*args, **kwargs) 2025-12-04T12:42:03.8122199Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8122239Z method(*args, **kwargs) 2025-12-04T12:42:03.8122389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8122426Z with policy(): 2025-12-04T12:42:03.8122578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8122620Z raise RuntimeError(msg) 2025-12-04T12:42:03.8123053Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 958398464 and is now 2843738112. 2025-12-04T12:42:03.8123056Z 2025-12-04T12:42:03.8123129Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8123465Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8123468Z 2025-12-04T12:42:03.8123565Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8123630Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8123697Z ======================= 1 failed, 14 deselected in 9.06s ======================= 2025-12-04T12:42:03.8123735Z Got exit code 1 2025-12-04T12:42:03.8123777Z Retrying single test... 
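The parent-side traceback above shows how a multi-process distributed test turns per-rank failures into a single pytest failure: the parent joins the spawned workers in _join_processes(), and _check_return_codes() re-raises when any rank exits non-zero (exit code 10 here), after which the runner script sees pytest exit with code 1 and retries the single test. The following is a minimal sketch of that join-and-check pattern using plain multiprocessing; it is not the MultiProcessTestCase code, and the worker body is a stand-in.

# Minimal sketch of the join-and-check pattern visible in the traceback above
# (self._join_processes -> self._check_return_codes -> raise RuntimeError). This is a
# plain multiprocessing illustration, not the MultiProcessTestCase implementation.
import multiprocessing as mp


def _worker(rank: int) -> None:
    # A real distributed test would init the process group and run the test body here;
    # exiting non-zero signals failure to the parent (the log shows exit code 10).
    raise SystemExit(10 if rank == 1 else 0)


def run_test(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # Mirrors "RuntimeError: Process N exited with error code 10 and exception:"
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")


if __name__ == "__main__":
    run_test()

Running this sketch reproduces the parent-side "Process 1 exited with error code 10" style of error without any GPU involvement, which is why the pytest summary reports the failure against the parent test rather than an individual rank.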
2025-12-04T12:42:03.8124002Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-549814058b19f53a.xml 2025-12-04T12:42:03.8124062Z ============================= test session starts ============================== 2025-12-04T12:42:03.8124183Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8124226Z cachedir: .pytest_cache 2025-12-04T12:42:03.8124383Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8124430Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8124470Z configfile: pytest.ini 2025-12-04T12:42:03.8124643Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8125013Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8125064Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8125410Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8125467Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8125525Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8125855Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8125902Z Running 1 items in this shard 2025-12-04T12:42:03.8125904Z 2025-12-04T12:42:03.8126317Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:36:11.683000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 467504 2025-12-04T12:42:03.8126473Z I1204 12:36:11.684000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 467505 2025-12-04T12:42:03.8126626Z I1204 12:36:11.684000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 467506 2025-12-04T12:42:03.8126776Z I1204 12:36:11.685000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 467507 2025-12-04T12:42:03.8127458Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8127502Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8128227Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8128272Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8128956Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8128999Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8129667Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8129735Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8130232Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8130280Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8130770Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8130818Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8131312Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8131358Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8131844Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8131891Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8132025Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8132181Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8132463Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8132620Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8132899Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8133016Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8133297Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8133440Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8133711Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8133860Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8134146Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8134275Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8134547Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8134689Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8135246Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.8135358Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8135547Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8136008Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8136119Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8136322Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8136482Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8136614Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8136767Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8137055Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8137203Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8137480Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8137606Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8137875Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8138016Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8138335Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8138487Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8138758Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8138886Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8139158Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8139299Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8139853Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8139961Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8140151Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8140608Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8140716Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8140919Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8141077Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8141218Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8141372Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8141651Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8141798Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8142085Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8142200Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8142468Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8142623Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8142901Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8143040Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8143308Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8143436Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8143708Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8143849Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8144401Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1249902592 and is now 2843738112. 2025-12-04T12:42:03.8144509Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8144699Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8145163Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8145270Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8145473Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8145638Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8145769Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8145922Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8146200Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8146356Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8146633Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8146750Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8147027Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8147180Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8147453Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8147591Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8147859Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8147988Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8148295Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8148436Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8148987Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8149097Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8149287Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8149747Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8149854Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8150070Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8150228Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8150270Z FAILED [8.7142s] [100%] 2025-12-04T12:42:03.8150272Z 2025-12-04T12:42:03.8150329Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8150513Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8150572Z Traceback (most recent call last): 2025-12-04T12:42:03.8150735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8150779Z self._join_processes(fn) 2025-12-04T12:42:03.8150951Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8151023Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8151200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8151258Z raise RuntimeError(error) 2025-12-04T12:42:03.8151337Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8151384Z Traceback (most recent call last): 2025-12-04T12:42:03.8151544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8151588Z getattr(self, test_name)() 2025-12-04T12:42:03.8151745Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8151780Z fn() 2025-12-04T12:42:03.8151932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8151974Z method(*args, **kwargs) 2025-12-04T12:42:03.8152125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8152167Z method(*args, **kwargs) 2025-12-04T12:42:03.8152317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8152354Z with policy(): 2025-12-04T12:42:03.8152506Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8152548Z raise RuntimeError(msg) 2025-12-04T12:42:03.8152982Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8152985Z 2025-12-04T12:42:03.8153059Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8153400Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8153402Z 2025-12-04T12:42:03.8153489Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8153492Z 2025-12-04T12:42:03.8153493Z 2025-12-04T12:42:03.8153572Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8153662Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8153942Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-549814058b19f53a.xml - 2025-12-04T12:42:03.8154005Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8154354Z FAILED [8.7142s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8154402Z Traceback (most recent call last): 2025-12-04T12:42:03.8154579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8154624Z getattr(self, test_name)() 2025-12-04T12:42:03.8154784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8154821Z fn() 2025-12-04T12:42:03.8154973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8155025Z method(*args, **kwargs) 2025-12-04T12:42:03.8155188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8155230Z method(*args, **kwargs) 2025-12-04T12:42:03.8155379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8155419Z with policy(): 2025-12-04T12:42:03.8155572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8155615Z raise RuntimeError(msg) 2025-12-04T12:42:03.8156050Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8156055Z 2025-12-04T12:42:03.8156132Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8156475Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8156477Z 2025-12-04T12:42:03.8156565Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8156629Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8156690Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8156728Z Got exit code 1 2025-12-04T12:42:03.8157013Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8157146Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8157373Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-41e14a589dc61213.xml 2025-12-04T12:42:03.8157435Z ============================= test session starts ============================== 2025-12-04T12:42:03.8157551Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8157593Z cachedir: .pytest_cache 2025-12-04T12:42:03.8157765Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8157810Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8157855Z configfile: pytest.ini 2025-12-04T12:42:03.8158018Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8158419Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8158471Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8158832Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8158892Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8158950Z collected 15 items / 5 deselected / 10 selected 2025-12-04T12:42:03.8159016Z stepcurrent: skipping 5 already run items. 
2025-12-04T12:42:03.8159063Z Running 10 items in this shard 2025-12-04T12:42:03.8159065Z 2025-12-04T12:42:03.8159484Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:36:23.035000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 467906 2025-12-04T12:42:03.8159642Z I1204 12:36:23.036000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 467907 2025-12-04T12:42:03.8159798Z I1204 12:36:23.037000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 467908 2025-12-04T12:42:03.8159950Z I1204 12:36:23.037000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 467909 2025-12-04T12:42:03.8160637Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8160681Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8161355Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8161403Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8162066Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8162110Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8162792Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8162836Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8163343Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8163391Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8163883Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8163948Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8164449Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8164498Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8164988Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8165037Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8165172Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8165331Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8165613Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8165761Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8166040Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8166157Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8166430Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8166571Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8166841Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8167001Z E1204 12:36:30.522000 467909 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8167291Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8167421Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8167717Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8167860Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8168465Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2843738112. 2025-12-04T12:42:03.8168601Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8168791Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8169249Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8169359Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8169562Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8169721Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8169851Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8170006Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8170288Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8170436Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8170712Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8170830Z E1204 12:36:30.542000 467908 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8171100Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8171239Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8171521Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8171662Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8171930Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8172069Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8172343Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8172485Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8173047Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8173166Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8173356Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8173812Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8173920Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8174122Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8174282Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8174413Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8174565Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8174846Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8174996Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8175273Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8175389Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8175670Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8175810Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8176079Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8176218Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8176498Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8176625Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8176897Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8177047Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8177606Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8177715Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8177904Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8178395Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8178503Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8178707Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8178865Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8178995Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8179148Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8179428Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8179576Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8179853Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8179969Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8180252Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8180395Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8180669Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8180821Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8181092Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8181219Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8181501Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8181653Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8182203Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8182314Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8182502Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8182960Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8183068Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8183270Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8183427Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8183467Z FAILED [8.6157s] [ 10%] 2025-12-04T12:42:03.8183470Z 2025-12-04T12:42:03.8183526Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8183709Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8183757Z Traceback (most recent call last): 2025-12-04T12:42:03.8183920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8183965Z self._join_processes(fn) 2025-12-04T12:42:03.8184137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8184206Z self._check_return_codes(fn, elapsed_time) 
2025-12-04T12:42:03.8184383Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T12:42:03.8184429Z     raise RuntimeError(error)
2025-12-04T12:42:03.8184509Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T12:42:03.8184555Z Traceback (most recent call last):
2025-12-04T12:42:03.8184715Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.8184759Z     getattr(self, test_name)()
2025-12-04T12:42:03.8184926Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.8184962Z     fn()
2025-12-04T12:42:03.8185113Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8185154Z     method(*args, **kwargs)
2025-12-04T12:42:03.8185307Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8185358Z     method(*args, **kwargs)
2025-12-04T12:42:03.8185507Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.8185556Z     with policy():
2025-12-04T12:42:03.8185707Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.8185749Z     raise RuntimeError(msg)
2025-12-04T12:42:03.8186182Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2843738112.
2025-12-04T12:42:03.8186186Z 
2025-12-04T12:42:03.8186260Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8186599Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T12:42:03.8186603Z 
2025-12-04T12:42:03.8186690Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8186692Z 
2025-12-04T12:42:03.8186694Z 
2025-12-04T12:42:03.8186772Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T12:42:03.8186859Z Process 3 terminated with exit code 10, terminating remaining processes.
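Note on the harness behavior shown above: the multi-process test scaffolding in torch/testing/_internal/common_distributed.py runs one worker per rank, and once any worker exits non-zero (exit code 10 here accompanies the memory-leak failure), the parent stops the remaining workers and re-raises the worker's exception via _join_processes/_check_return_codes. A minimal sketch of that join-and-check pattern, using plain multiprocessing and hypothetical names (run_rank, MEM_LEAK_EXIT_CODE) rather than the actual PyTorch internals:

    # Sketch only: illustrates the join-and-check pattern described above.
    import multiprocessing as mp

    MEM_LEAK_EXIT_CODE = 10  # assumed meaning, mirroring "exit code 10" in the log

    def run_rank(rank: int) -> None:
        # A real worker would set up its device/process group and run the test body;
        # here rank 3 simply fails the way the log shows.
        raise SystemExit(MEM_LEAK_EXIT_CODE if rank == 3 else 0)

    def join_and_check(world_size: int = 4) -> None:
        procs = [mp.Process(target=run_rank, args=(rank,)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                # The first non-zero exit code fails the whole test.
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        join_and_check()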
2025-12-04T12:42:03.8187132Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-41e14a589dc61213.xml -
2025-12-04T12:42:03.8187193Z =========================== short test summary info ============================
2025-12-04T12:42:03.8187543Z FAILED [8.6157s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T12:42:03.8187590Z Traceback (most recent call last):
2025-12-04T12:42:03.8187756Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.8187798Z     getattr(self, test_name)()
2025-12-04T12:42:03.8187958Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.8187994Z     fn()
2025-12-04T12:42:03.8188197Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8188239Z     method(*args, **kwargs)
2025-12-04T12:42:03.8188389Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8188430Z     method(*args, **kwargs)
2025-12-04T12:42:03.8188581Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.8188621Z     with policy():
2025-12-04T12:42:03.8188774Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.8188814Z     raise RuntimeError(msg)
2025-12-04T12:42:03.8189261Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2843738112.
2025-12-04T12:42:03.8189263Z 
2025-12-04T12:42:03.8189351Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8189689Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T12:42:03.8189706Z 
2025-12-04T12:42:03.8189793Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8189858Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.8189921Z ======================= 1 failed, 5 deselected in 8.75s ========================
2025-12-04T12:42:03.8189959Z Got exit code 1
2025-12-04T12:42:03.8189999Z Retrying single test...
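Context for the retry below: the RuntimeError comes from PyTorch's GPU memory-leak checker (enabled here by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the repro command and the mem_leak_check config of this shard), which compares caching-allocator and driver-level memory counters taken before and after the test body; on ROCm the counters are reported through the CUDA-compatible API. A rough, self-contained sketch of that kind of before/after check, assuming a simplified check_for_leak helper rather than the actual implementation in common_utils.py:

    # Rough sketch of a before/after GPU memory-leak check, as suggested by the
    # "Caching allocator allocated memory was X and is now Y" messages above.
    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)
        driver_before = total - free_before                   # driver-level allocation

        fn()  # run the test body

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        # Flag a leak only when both the caching allocator and the driver report growth,
        # which is the spirit of "CUDA driver API confirmed a leak" in the log.
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"Possible leak on device {device}: caching allocator allocated memory was "
                f"{alloc_before} and is now {alloc_after}; driver allocated memory was "
                f"{driver_before} and is now {driver_after}."
            )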
2025-12-04T12:42:03.8190229Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d18ec349ed0c91.xml 2025-12-04T12:42:03.8190287Z ============================= test session starts ============================== 2025-12-04T12:42:03.8190400Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8190442Z cachedir: .pytest_cache 2025-12-04T12:42:03.8190600Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8190646Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8190687Z configfile: pytest.ini 2025-12-04T12:42:03.8190851Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8191209Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8191263Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8191608Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8191667Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8191722Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8192053Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8192097Z Running 1 items in this shard 2025-12-04T12:42:03.8192099Z 2025-12-04T12:42:03.8192519Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:36:34.342000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 468308 2025-12-04T12:42:03.8192678Z I1204 12:36:34.343000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 468309 2025-12-04T12:42:03.8192829Z I1204 12:36:34.343000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 468310 2025-12-04T12:42:03.8192988Z I1204 12:36:34.344000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 468311 2025-12-04T12:42:03.8193669Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8193732Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8194403Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8194445Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8195115Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8195159Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8195654Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8195703Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8196370Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8196414Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8196905Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8196953Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8197454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8197501Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8197998Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8198044Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8198216Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8198386Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8198687Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8198834Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8199115Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8199234Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8199505Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8199649Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8199918Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8200061Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8200329Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8200458Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8200730Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8200871Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8201437Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 
2025-12-04T12:42:03.8201547Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8201739Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8202198Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8202323Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8202530Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8202687Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8202829Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8202992Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8203273Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8203418Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8203697Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8203813Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8204083Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8204223Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8204492Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8204633Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8204900Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8205030Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8205299Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8205441Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8206004Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8206115Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8206305Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8206770Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8206880Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8207083Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8207258Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8207389Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8207541Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8207822Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8207968Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8208283Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8208399Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8208670Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8208810Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8209078Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8209218Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8209486Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8209615Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8209886Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8210039Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8210590Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8210699Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8210900Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8211356Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8211476Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8211769Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8211926Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8212057Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8212211Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8212492Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8212638Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8212918Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8213032Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8213303Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8213444Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8213712Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8213854Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8214120Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8214248Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8214527Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8216888Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8217452Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8217618Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8217811Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8218317Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8218472Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8218675Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8218834Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8218876Z FAILED [8.7139s] [100%] 2025-12-04T12:42:03.8218881Z 2025-12-04T12:42:03.8218941Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8219124Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8219174Z Traceback (most recent call last): 2025-12-04T12:42:03.8219341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8219386Z self._join_processes(fn) 2025-12-04T12:42:03.8219562Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8219616Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8219795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8219838Z raise RuntimeError(error) 2025-12-04T12:42:03.8219920Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8219965Z Traceback (most recent call last): 2025-12-04T12:42:03.8220128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8220173Z getattr(self, test_name)() 2025-12-04T12:42:03.8220333Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8220367Z fn() 2025-12-04T12:42:03.8220518Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8220559Z method(*args, **kwargs) 2025-12-04T12:42:03.8220711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8220750Z method(*args, **kwargs) 2025-12-04T12:42:03.8220916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8220954Z with policy(): 2025-12-04T12:42:03.8221109Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8221150Z raise RuntimeError(msg) 2025-12-04T12:42:03.8221592Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8221594Z 2025-12-04T12:42:03.8221686Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8222027Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8222041Z 2025-12-04T12:42:03.8222132Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8222134Z 2025-12-04T12:42:03.8222207Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8222254Z Traceback (most recent call last): 2025-12-04T12:42:03.8222416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8222460Z getattr(self, test_name)() 2025-12-04T12:42:03.8222619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8222654Z fn() 2025-12-04T12:42:03.8222805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8222844Z method(*args, **kwargs) 2025-12-04T12:42:03.8222997Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8223036Z method(*args, **kwargs) 2025-12-04T12:42:03.8223187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8223224Z with policy(): 2025-12-04T12:42:03.8223376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8223416Z raise RuntimeError(msg) 2025-12-04T12:42:03.8223848Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 
2025-12-04T12:42:03.8223851Z 2025-12-04T12:42:03.8223925Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8224269Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8224273Z 2025-12-04T12:42:03.8224360Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8224363Z 2025-12-04T12:42:03.8224365Z 2025-12-04T12:42:03.8224441Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8224530Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8224804Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d18ec349ed0c91.xml - 2025-12-04T12:42:03.8224876Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8225223Z FAILED [8.7139s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8225270Z Traceback (most recent call last): 2025-12-04T12:42:03.8225434Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8225477Z getattr(self, test_name)() 2025-12-04T12:42:03.8225647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8225683Z fn() 2025-12-04T12:42:03.8225835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8225877Z method(*args, **kwargs) 2025-12-04T12:42:03.8226027Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8226077Z method(*args, **kwargs) 2025-12-04T12:42:03.8226239Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8226276Z with policy(): 2025-12-04T12:42:03.8226428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8226468Z raise RuntimeError(msg) 2025-12-04T12:42:03.8226905Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.8226907Z 2025-12-04T12:42:03.8226980Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8227317Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8227320Z 2025-12-04T12:42:03.8227408Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8227410Z 2025-12-04T12:42:03.8227470Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8227515Z Traceback (most recent call last): 2025-12-04T12:42:03.8227677Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8227719Z getattr(self, test_name)() 2025-12-04T12:42:03.8227877Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8227912Z fn() 2025-12-04T12:42:03.8228062Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8228104Z method(*args, **kwargs) 2025-12-04T12:42:03.8228292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8228331Z method(*args, **kwargs) 2025-12-04T12:42:03.8228480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8228518Z with policy(): 2025-12-04T12:42:03.8228670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8228710Z raise RuntimeError(msg) 2025-12-04T12:42:03.8229156Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 2025-12-04T12:42:03.8229160Z 2025-12-04T12:42:03.8229233Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8229580Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8229584Z 2025-12-04T12:42:03.8229669Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8229733Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8229797Z ======================= 1 failed, 14 deselected in 8.86s ======================= 2025-12-04T12:42:03.8229834Z Got exit code 1 2025-12-04T12:42:03.8229887Z Retrying single test... 
2025-12-04T12:42:03.8230113Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1995dd2608cbad94.xml 2025-12-04T12:42:03.8230185Z ============================= test session starts ============================== 2025-12-04T12:42:03.8230299Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8230340Z cachedir: .pytest_cache 2025-12-04T12:42:03.8230500Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8230546Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8230586Z configfile: pytest.ini 2025-12-04T12:42:03.8230751Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8231111Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8231162Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8231508Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8231567Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8231623Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8231953Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8231997Z Running 1 items in this shard 2025-12-04T12:42:03.8231999Z 2025-12-04T12:42:03.8232405Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:36:45.688000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 468710 2025-12-04T12:42:03.8232561Z I1204 12:36:45.689000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 468711 2025-12-04T12:42:03.8232715Z I1204 12:36:45.689000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 468712 2025-12-04T12:42:03.8232866Z I1204 12:36:45.690000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 468713 2025-12-04T12:42:03.8233559Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8233607Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8234292Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8234347Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8234844Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8234906Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8235585Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8235627Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8236294Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8236337Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8236830Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8236878Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8237364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8237412Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8237912Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8237960Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8238095Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8238294Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8238589Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8238736Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8239014Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8239142Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8239426Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8239567Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8239836Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8239977Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8240246Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8240377Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8240646Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8240788Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8241342Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2843738112. 
2025-12-04T12:42:03.8241454Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8241643Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8242100Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8242224Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8242428Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8242588Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8242719Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8242880Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8243160Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8243306Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8243592Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8243718Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8243987Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8244127Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8244397Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8244538Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8244806Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8244935Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8245208Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8245350Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8245902Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8246011Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8246201Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8246665Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8246777Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8247062Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8247220Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8247366Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8247519Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8247801Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8247958Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8248283Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8248399Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8248665Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8248807Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8249075Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8249216Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8249483Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8249613Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8249882Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8250023Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8250576Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8250686Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8250895Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8251350Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8251458Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8251671Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8251828Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8251958Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8252111Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8252401Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8252560Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8252841Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8252954Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8253223Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8253362Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8253630Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8253768Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8254034Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8254162Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8254430Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8254572Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8255123Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8255241Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8255430Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8255886Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8256006Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8256207Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8256365Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8256416Z FAILED [8.7146s] [100%] 2025-12-04T12:42:03.8256418Z 2025-12-04T12:42:03.8256476Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8256669Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8256715Z Traceback (most recent call last): 2025-12-04T12:42:03.8256877Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8256923Z self._join_processes(fn) 2025-12-04T12:42:03.8257096Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8257151Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8257332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8257376Z raise RuntimeError(error) 2025-12-04T12:42:03.8257457Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8257503Z Traceback (most recent call last): 2025-12-04T12:42:03.8257664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8257706Z getattr(self, test_name)() 2025-12-04T12:42:03.8257865Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8257899Z fn() 2025-12-04T12:42:03.8258050Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8258090Z method(*args, **kwargs) 2025-12-04T12:42:03.8258285Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8258325Z method(*args, **kwargs) 2025-12-04T12:42:03.8258475Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8258513Z with policy(): 2025-12-04T12:42:03.8258666Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8258706Z raise RuntimeError(msg) 2025-12-04T12:42:03.8259138Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8259141Z 2025-12-04T12:42:03.8259231Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8259569Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8259573Z 2025-12-04T12:42:03.8259661Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8259664Z 2025-12-04T12:42:03.8259665Z 2025-12-04T12:42:03.8259741Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8259842Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8260117Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1995dd2608cbad94.xml - 2025-12-04T12:42:03.8260179Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8260538Z FAILED [8.7146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8260597Z Traceback (most recent call last): 2025-12-04T12:42:03.8260762Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8260803Z getattr(self, test_name)() 2025-12-04T12:42:03.8260964Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8260998Z fn() 2025-12-04T12:42:03.8261149Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8261190Z method(*args, **kwargs) 2025-12-04T12:42:03.8261340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8261379Z method(*args, **kwargs) 2025-12-04T12:42:03.8261530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8261566Z with policy(): 2025-12-04T12:42:03.8261718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8261757Z raise RuntimeError(msg) 2025-12-04T12:42:03.8262192Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8262195Z 2025-12-04T12:42:03.8262268Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8262606Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8262609Z 2025-12-04T12:42:03.8262697Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8262759Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8262821Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8262858Z Got exit code 1 2025-12-04T12:42:03.8263153Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8263282Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8263511Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-30cedd8bdfe9b548.xml 2025-12-04T12:42:03.8263570Z ============================= test session starts ============================== 2025-12-04T12:42:03.8263682Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8263722Z cachedir: .pytest_cache 2025-12-04T12:42:03.8263893Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8263939Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8263980Z configfile: pytest.ini 2025-12-04T12:42:03.8264143Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8264512Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8264580Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8264925Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8264982Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8265035Z collected 15 items / 6 deselected / 9 selected 2025-12-04T12:42:03.8265087Z stepcurrent: skipping 6 already run items. 
2025-12-04T12:42:03.8265130Z Running 9 items in this shard 2025-12-04T12:42:03.8265133Z 2025-12-04T12:42:03.8265541Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:36:56.876000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 469112 2025-12-04T12:42:03.8265697Z I1204 12:36:56.876000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 469113 2025-12-04T12:42:03.8265849Z I1204 12:36:56.877000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 469114 2025-12-04T12:42:03.8265999Z I1204 12:36:56.878000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 469115 2025-12-04T12:42:03.8266685Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8266730Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8267398Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8267441Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8268141Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8268252Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8268938Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8268979Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8269493Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8269555Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8270047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8270097Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8270582Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8270629Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8271114Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8271160Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8271295Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8271450Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8271735Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8271881Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8272159Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8272289Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8272559Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8272701Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8272981Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8273121Z E1204 12:37:04.309000 469112 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8273389Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8273528Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8273810Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8273951Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8274507Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8274617Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8274807Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8275264Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8275371Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8275573Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8275732Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8275866Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8276019Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8276300Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8276447Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8276735Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8276850Z E1204 12:37:04.316000 469115 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8277119Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8277270Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8277539Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8277680Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8277960Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8278099Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8278414Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8278556Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8279110Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 
2025-12-04T12:42:03.8279218Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8279407Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8279863Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8279971Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8280174Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8280334Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8280463Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8280618Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8280911Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8281058Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8281336Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8281451Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8281732Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8281872Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8282141Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8282301Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8282583Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8282712Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8282981Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8283121Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8283674Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8283782Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8283971Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8284428Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8284536Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8284737Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8284894Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8285024Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8285177Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8285464Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8285611Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8285888Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8286012Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8286281Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8286420Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8286698Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8286847Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8287115Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8287243Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8287512Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8287654Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8288386Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8288494Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8288684Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8289139Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8289248Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8289453Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8289609Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8289648Z FAILED [8.7146s] [ 11%] 2025-12-04T12:42:03.8289667Z 2025-12-04T12:42:03.8289724Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8289907Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8289955Z Traceback (most recent call last): 2025-12-04T12:42:03.8290119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8290163Z self._join_processes(fn) 2025-12-04T12:42:03.8290351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8290404Z self._check_return_codes(fn, elapsed_time) 
2025-12-04T12:42:03.8290583Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8290626Z raise RuntimeError(error) 2025-12-04T12:42:03.8290706Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8290764Z Traceback (most recent call last): 2025-12-04T12:42:03.8290925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8290983Z getattr(self, test_name)() 2025-12-04T12:42:03.8291142Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8291176Z fn() 2025-12-04T12:42:03.8291328Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8291367Z method(*args, **kwargs) 2025-12-04T12:42:03.8291518Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8291557Z method(*args, **kwargs) 2025-12-04T12:42:03.8291708Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8291746Z with policy(): 2025-12-04T12:42:03.8291897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8291938Z raise RuntimeError(msg) 2025-12-04T12:42:03.8292372Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8292375Z 2025-12-04T12:42:03.8292449Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8292786Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8292789Z 2025-12-04T12:42:03.8292877Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8292880Z 2025-12-04T12:42:03.8292937Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8292983Z Traceback (most recent call last): 2025-12-04T12:42:03.8293145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8293186Z getattr(self, test_name)() 2025-12-04T12:42:03.8293346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8293380Z fn() 2025-12-04T12:42:03.8293540Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8293583Z method(*args, **kwargs) 2025-12-04T12:42:03.8293733Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8293772Z method(*args, **kwargs) 2025-12-04T12:42:03.8293923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8293960Z with policy(): 2025-12-04T12:42:03.8294111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8294152Z raise RuntimeError(msg) 2025-12-04T12:42:03.8294593Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 2025-12-04T12:42:03.8294595Z 2025-12-04T12:42:03.8294678Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8295014Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8295025Z 2025-12-04T12:42:03.8295111Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8295113Z 2025-12-04T12:42:03.8295115Z 2025-12-04T12:42:03.8295192Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8295282Z Process 2 terminated with exit code 10, terminating remaining processes. 
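[Note on the leak reports above: the failures are raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 policy named in the repro command, which compares the caching allocator's per-device usage before and after the test body. The sketch below is an illustration of that before/after comparison only, not the internal checker; run_test_body and the byte threshold are hypothetical placeholders.]

    import torch

    def check_for_cuda_leak(run_test_body, device=0):
        # Snapshot the caching allocator before the test body runs.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)

        run_test_body()

        # Drop cached-but-unused blocks so only live allocations remain,
        # then compare against the snapshot taken above.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        after = torch.cuda.memory_allocated(device)

        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: allocated memory "
                f"went from {before} to {after} bytes"
            )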
2025-12-04T12:42:03.8295555Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-30cedd8bdfe9b548.xml - 2025-12-04T12:42:03.8295615Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8295961Z FAILED [8.7146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8296009Z Traceback (most recent call last): 2025-12-04T12:42:03.8296172Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8296215Z getattr(self, test_name)() 2025-12-04T12:42:03.8296376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8296410Z fn() 2025-12-04T12:42:03.8296561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8296603Z method(*args, **kwargs) 2025-12-04T12:42:03.8296753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8296794Z method(*args, **kwargs) 2025-12-04T12:42:03.8296943Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8296979Z with policy(): 2025-12-04T12:42:03.8297130Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8297172Z raise RuntimeError(msg) 2025-12-04T12:42:03.8297616Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8297620Z 2025-12-04T12:42:03.8297693Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8298029Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8298032Z 2025-12-04T12:42:03.8298117Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8298119Z 2025-12-04T12:42:03.8298233Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8298278Z Traceback (most recent call last): 2025-12-04T12:42:03.8298442Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8298483Z getattr(self, test_name)() 2025-12-04T12:42:03.8298644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8298691Z fn() 2025-12-04T12:42:03.8298842Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8298896Z method(*args, **kwargs) 2025-12-04T12:42:03.8299046Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8299087Z method(*args, **kwargs) 2025-12-04T12:42:03.8299236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8299274Z with policy(): 2025-12-04T12:42:03.8299424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8299466Z raise RuntimeError(msg) 2025-12-04T12:42:03.8299894Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 2025-12-04T12:42:03.8299898Z 2025-12-04T12:42:03.8299970Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8300305Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8300308Z 2025-12-04T12:42:03.8300394Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8300458Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8300521Z ======================= 1 failed, 6 deselected in 8.85s ======================== 2025-12-04T12:42:03.8300559Z Got exit code 1 2025-12-04T12:42:03.8300599Z Retrying single test... 
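[Note on the FutureWarning emitted by this test file: it recommends replacing the deprecated FSDP.set_state_dict_type() path with torch.distributed.checkpoint.state_dict.get_state_dict()/set_state_dict(), per the linked API doc. A minimal sketch of that migration is below; model and optimizer are placeholders for an already-initialized FSDP-wrapped module and its optimizer, and persistence of the returned dicts is elided.]

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    def round_trip_state(model, optimizer):
        # Gather model and optimizer state dicts via the recommended API
        # instead of FSDP.set_state_dict_type (deprecated per the warning).
        model_state, optim_state = get_state_dict(model, optimizer)

        # ... persist / reload model_state and optim_state here, e.g. with
        # torch.distributed.checkpoint ...

        # Restore both state dicts onto the wrapped model and optimizer.
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_state,
            optim_state_dict=optim_state,
        )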
2025-12-04T12:42:03.8300828Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ab6be7b832dcdd81.xml 2025-12-04T12:42:03.8300884Z ============================= test session starts ============================== 2025-12-04T12:42:03.8300997Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8301039Z cachedir: .pytest_cache 2025-12-04T12:42:03.8301197Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8301241Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8301282Z configfile: pytest.ini 2025-12-04T12:42:03.8301469Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8301829Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8301879Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8302233Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8302291Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8302348Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8302679Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8302734Z Running 1 items in this shard 2025-12-04T12:42:03.8302746Z 2025-12-04T12:42:03.8303156Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:37:08.412000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 469514 2025-12-04T12:42:03.8303311Z I1204 12:37:08.413000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 469515 2025-12-04T12:42:03.8303463Z I1204 12:37:08.414000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 469516 2025-12-04T12:42:03.8303614Z I1204 12:37:08.415000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 469517 2025-12-04T12:42:03.8304297Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8304343Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8305014Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8305058Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8305731Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8305772Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8306280Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8306330Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8307010Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8307052Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8307547Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8307614Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8308105Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8308182Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8308670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8308718Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8308852Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8309009Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8309295Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8309441Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8309722Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8309840Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8310111Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8310253Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8310537Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8310680Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8310950Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8311080Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8311364Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8311505Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8312059Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.8312192Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8312383Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8312838Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8312947Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8313152Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8313311Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8313440Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8313592Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8313871Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8314019Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8314295Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8314409Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8314678Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8314818Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8315105Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8315245Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8315516Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8315668Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8315937Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8316079Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8316640Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8316760Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8316947Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8317403Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8317512Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8317715Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8317875Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8318003Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8318188Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8318466Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8318613Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8318931Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8319047Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8319329Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8319469Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8319737Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8319876Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8320157Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8320285Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8320554Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8320708Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8321272Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8321380Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8321568Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8322024Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8322132Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8322333Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8322490Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8322621Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8322775Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8323057Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8323204Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8323482Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8323606Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8323876Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8324017Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8324283Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8324432Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8324700Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8324826Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8325106Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8325264Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8325817Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8325924Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8326113Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8326571Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8326678Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8326879Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8327036Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8327076Z FAILED [8.6147s] [100%] 2025-12-04T12:42:03.8327079Z 2025-12-04T12:42:03.8327135Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8327316Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8327363Z Traceback (most recent call last): 2025-12-04T12:42:03.8327525Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8327570Z self._join_processes(fn) 2025-12-04T12:42:03.8327742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8327809Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8327986Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8328031Z raise RuntimeError(error) 2025-12-04T12:42:03.8328112Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8328186Z Traceback (most recent call last): 2025-12-04T12:42:03.8328346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8328388Z getattr(self, test_name)() 2025-12-04T12:42:03.8328560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8328595Z fn() 2025-12-04T12:42:03.8328747Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8328787Z method(*args, **kwargs) 2025-12-04T12:42:03.8328937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8328990Z method(*args, **kwargs) 2025-12-04T12:42:03.8329158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8329195Z with policy(): 2025-12-04T12:42:03.8329347Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8329386Z raise RuntimeError(msg) 2025-12-04T12:42:03.8329818Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8329821Z 2025-12-04T12:42:03.8329895Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8330238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8330241Z 2025-12-04T12:42:03.8330328Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8330330Z 2025-12-04T12:42:03.8330332Z 2025-12-04T12:42:03.8330409Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8330497Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8330769Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ab6be7b832dcdd81.xml - 2025-12-04T12:42:03.8330830Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8331178Z FAILED [8.6147s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8331226Z Traceback (most recent call last): 2025-12-04T12:42:03.8331389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8331433Z getattr(self, test_name)() 2025-12-04T12:42:03.8331591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8331626Z fn() 2025-12-04T12:42:03.8331788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8331829Z method(*args, **kwargs) 2025-12-04T12:42:03.8331981Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8332020Z method(*args, **kwargs) 2025-12-04T12:42:03.8332168Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8332207Z with policy(): 2025-12-04T12:42:03.8332358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8332408Z raise RuntimeError(msg) 2025-12-04T12:42:03.8332843Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8332857Z 2025-12-04T12:42:03.8332931Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8333269Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8333282Z 2025-12-04T12:42:03.8333368Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8333515Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8333577Z ======================= 1 failed, 14 deselected in 8.78s ======================= 2025-12-04T12:42:03.8333615Z Got exit code 1 2025-12-04T12:42:03.8333654Z Retrying single test... 2025-12-04T12:42:03.8333880Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58416b2f8055c3c6.xml 2025-12-04T12:42:03.8333938Z ============================= test session starts ============================== 2025-12-04T12:42:03.8334053Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8334095Z cachedir: .pytest_cache 2025-12-04T12:42:03.8334252Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8334298Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8334337Z configfile: pytest.ini 2025-12-04T12:42:03.8334503Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8334864Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8334916Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8335261Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8335320Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8335375Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8335705Z stepcurrent: skipping 6 already run items. 
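The PytestCollectionWarning lines emitted during collection above are benign: pytest treats any class whose name matches its Test* pattern as a candidate test class and refuses to collect it when that class defines __init__, which is exactly the case for the two nn.Module helpers in test_fsdp_dtensor_state_dict.py. A minimal illustration of the rule (class names below are made up for the example, not taken from the test file):

    import torch

    class TestCollectedByPytest:              # collected: Test* name, no __init__
        def test_something(self):
            assert True

    class TestHelperModel(torch.nn.Module):   # skipped with a PytestCollectionWarning,
        def __init__(self):                   # because a Test*-named class defines __init__
            super().__init__()
            self.linear = torch.nn.Linear(2, 2)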
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8335749Z Running 1 items in this shard 2025-12-04T12:42:03.8335752Z 2025-12-04T12:42:03.8336168Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:37:19.658000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 469916 2025-12-04T12:42:03.8336326Z I1204 12:37:19.659000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 469917 2025-12-04T12:42:03.8336477Z I1204 12:37:19.660000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 469918 2025-12-04T12:42:03.8336639Z I1204 12:37:19.661000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 469919 2025-12-04T12:42:03.8337322Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8337384Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8338057Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8338098Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8338632Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8338681Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8339353Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8339395Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8339890Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8339939Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8340429Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8340476Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8341162Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8341205Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8341708Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8341756Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8341893Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8342066Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8342365Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8342514Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8342794Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8342911Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8343180Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8343324Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8343592Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8343733Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8344002Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8344132Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8344403Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8344543Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8345114Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
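The FutureWarning repeated above points at torch.distributed.checkpoint.state_dict as the replacement for FSDP.state_dict_type()/FSDP.set_state_dict_type(). Roughly, the migration looks like the sketch below; model and optim are assumed placeholders for the FSDP-wrapped module and its optimizer, and the API doc linked in the warning remains the authoritative reference:

    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    # Gather sharded model/optimizer state; cpu_offload mirrors the
    # offload_to_cpu=True flavor exercised by the failing test.
    options = StateDictOptions(cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optim, options=options)

    # Load the gathered state back into a (re)wrapped model and optimizer.
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )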
2025-12-04T12:42:03.8345227Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8345415Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8345881Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8345990Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8346194Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8346351Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8346499Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8346661Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8346941Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8347088Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8347366Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8347483Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8347751Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8347892Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8348196Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8348338Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8348607Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8348736Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8349006Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8349146Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8349713Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1254096896 and is now 2820669440. 2025-12-04T12:42:03.8349822Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8350011Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8350477Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8350586Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8350799Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8350967Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8351097Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8351249Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8351530Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8351677Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8351954Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8352072Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8352339Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8352479Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8352746Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8352887Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8353153Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8353281Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8353551Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8353701Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8354252Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8354361Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8354558Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8355013Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8355129Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8355341Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8355497Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8355627Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8355777Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8356057Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8356204Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8356482Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8356597Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8356864Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8357004Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8357272Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8357412Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8357679Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8357806Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8358087Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8358255Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8358825Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8358932Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8359122Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8359575Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8359707Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8359910Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8360065Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8360104Z FAILED [8.7148s] [100%] 2025-12-04T12:42:03.8360106Z 2025-12-04T12:42:03.8360162Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8360346Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8360393Z Traceback (most recent call last): 2025-12-04T12:42:03.8360558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8360601Z self._join_processes(fn) 2025-12-04T12:42:03.8360774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8360828Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8361008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8361050Z raise RuntimeError(error) 2025-12-04T12:42:03.8361133Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8361179Z Traceback (most recent call last): 2025-12-04T12:42:03.8361339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8361382Z getattr(self, test_name)() 2025-12-04T12:42:03.8361541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8361576Z fn() 2025-12-04T12:42:03.8361726Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8361767Z method(*args, **kwargs) 2025-12-04T12:42:03.8361917Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8361957Z method(*args, **kwargs) 2025-12-04T12:42:03.8362121Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8362161Z with policy(): 2025-12-04T12:42:03.8362311Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8362353Z raise RuntimeError(msg) 2025-12-04T12:42:03.8362801Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8362803Z 2025-12-04T12:42:03.8362880Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8363220Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8363233Z 2025-12-04T12:42:03.8363321Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8363334Z 2025-12-04T12:42:03.8363395Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8363440Z Traceback (most recent call last): 2025-12-04T12:42:03.8363604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8363646Z getattr(self, test_name)() 2025-12-04T12:42:03.8363808Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8363841Z fn() 2025-12-04T12:42:03.8363992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8364033Z method(*args, **kwargs) 2025-12-04T12:42:03.8364184Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8364224Z method(*args, **kwargs) 2025-12-04T12:42:03.8364376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8364416Z with policy(): 2025-12-04T12:42:03.8364568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8364608Z raise RuntimeError(msg) 2025-12-04T12:42:03.8365041Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1254096896 and is now 2820669440. 
2025-12-04T12:42:03.8365044Z 2025-12-04T12:42:03.8365119Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8365455Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8365458Z 2025-12-04T12:42:03.8365545Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8365547Z 2025-12-04T12:42:03.8365549Z 2025-12-04T12:42:03.8365625Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8365714Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8365996Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58416b2f8055c3c6.xml - 2025-12-04T12:42:03.8366058Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8366404Z FAILED [8.7148s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8366453Z Traceback (most recent call last): 2025-12-04T12:42:03.8366616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8366668Z getattr(self, test_name)() 2025-12-04T12:42:03.8366830Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8366863Z fn() 2025-12-04T12:42:03.8367017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8367055Z method(*args, **kwargs) 2025-12-04T12:42:03.8367217Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8367267Z method(*args, **kwargs) 2025-12-04T12:42:03.8367417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8367453Z with policy(): 2025-12-04T12:42:03.8367605Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8367646Z raise RuntimeError(msg) 2025-12-04T12:42:03.8368083Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8368086Z 2025-12-04T12:42:03.8368196Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8368532Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8368535Z 2025-12-04T12:42:03.8368623Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8368625Z 2025-12-04T12:42:03.8368684Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8368730Z Traceback (most recent call last): 2025-12-04T12:42:03.8368891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8368934Z getattr(self, test_name)() 2025-12-04T12:42:03.8369094Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8369129Z fn() 2025-12-04T12:42:03.8369316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8369358Z method(*args, **kwargs) 2025-12-04T12:42:03.8369507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8369547Z method(*args, **kwargs) 2025-12-04T12:42:03.8369697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8369734Z with policy(): 2025-12-04T12:42:03.8369887Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8369929Z raise RuntimeError(msg) 2025-12-04T12:42:03.8370375Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1254096896 and is now 2820669440. 2025-12-04T12:42:03.8370380Z 2025-12-04T12:42:03.8370453Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8370802Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8370804Z 2025-12-04T12:42:03.8370889Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8370953Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:42:03.8371015Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8371065Z Got exit code 1 2025-12-04T12:42:03.8371348Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8371491Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8371720Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-caa2a66b5eb1de6f.xml 2025-12-04T12:42:03.8371777Z ============================= test session starts ============================== 2025-12-04T12:42:03.8371890Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8371931Z cachedir: .pytest_cache 2025-12-04T12:42:03.8372089Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8372136Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8372176Z configfile: pytest.ini 2025-12-04T12:42:03.8372341Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8372705Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8372756Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8373103Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8373160Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8373216Z collected 15 items / 7 deselected / 8 selected 2025-12-04T12:42:03.8373268Z stepcurrent: skipping 7 already run items. 2025-12-04T12:42:03.8373313Z Running 8 items in this shard 2025-12-04T12:42:03.8373315Z 2025-12-04T12:42:03.8373721Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:37:30.893000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 470318 2025-12-04T12:42:03.8373879Z I1204 12:37:30.893000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 470319 2025-12-04T12:42:03.8374043Z I1204 12:37:30.894000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 470320 2025-12-04T12:42:03.8374195Z I1204 12:37:30.895000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 470321 2025-12-04T12:42:03.8374879Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8374932Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8375608Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8375670Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8376339Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8376381Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8376879Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8376929Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8377419Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8377466Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8378184Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8378226Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8378718Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8378764Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8379267Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8379317Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8379451Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8379607Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8379901Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8380051Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8380345Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8380473Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8380744Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8380885Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8381155Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8381295Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8381564Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8381694Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8381967Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8382108Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8382667Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 958398464 and is now 2820669440. 2025-12-04T12:42:03.8382779Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8382970Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8383435Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8383546Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8383748Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8383906Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8384047Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8384201Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8384483Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8384642Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8384930Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8385045Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8385314Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8385454Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8385723Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8385862Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8386130Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8386258Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:42:03.8386529Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8386670Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8387219Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8387329Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8387517Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8387983Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8388093Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8388331Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8388505Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8388636Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8388790Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8389082Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8389242Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8389519Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8389635Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8389904Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8390044Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8390312Z E1204 
12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8390451Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8390719Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8390848Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8391122Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8391264Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8391814Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1256194048 and is now 2820669440. 2025-12-04T12:42:03.8391934Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8392123Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8392578Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8392686Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8392898Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8393057Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8393186Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8393348Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8393643Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8393790Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8394066Z E1204 12:37:38.482000 470319 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8394182Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8394449Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8394590Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8394859Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8394997Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8395264Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8395392Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8395666Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8395806Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8396364Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8396473Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8396662Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8397126Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8397233Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8397437Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8397605Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8397655Z FAILED [8.7134s] [ 12%] 2025-12-04T12:42:03.8397658Z 2025-12-04T12:42:03.8397715Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8397895Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8397943Z Traceback (most recent call last): 2025-12-04T12:42:03.8398107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8398192Z self._join_processes(fn) 2025-12-04T12:42:03.8398386Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8398458Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8398637Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8398682Z raise RuntimeError(error) 2025-12-04T12:42:03.8398765Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8398811Z Traceback (most recent call last): 2025-12-04T12:42:03.8398971Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8399015Z getattr(self, test_name)() 2025-12-04T12:42:03.8399173Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8399208Z fn() 2025-12-04T12:42:03.8399359Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8399402Z method(*args, **kwargs) 2025-12-04T12:42:03.8399552Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8399594Z method(*args, **kwargs) 2025-12-04T12:42:03.8399743Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8399781Z with policy(): 2025-12-04T12:42:03.8399934Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8399975Z raise RuntimeError(msg) 2025-12-04T12:42:03.8400429Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1256194048 and is now 2820669440. 2025-12-04T12:42:03.8400432Z 2025-12-04T12:42:03.8400509Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8400849Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8400852Z 2025-12-04T12:42:03.8400938Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8400941Z 2025-12-04T12:42:03.8401013Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8401059Z Traceback (most recent call last): 2025-12-04T12:42:03.8401224Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8401267Z getattr(self, test_name)() 2025-12-04T12:42:03.8401428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8401480Z fn() 2025-12-04T12:42:03.8401630Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8401687Z method(*args, **kwargs) 2025-12-04T12:42:03.8401836Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8401876Z method(*args, **kwargs) 2025-12-04T12:42:03.8402025Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8402064Z with policy(): 2025-12-04T12:42:03.8402215Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8402258Z raise RuntimeError(msg) 2025-12-04T12:42:03.8402685Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 958398464 and is now 2820669440. 
2025-12-04T12:42:03.8402689Z 2025-12-04T12:42:03.8402763Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8403102Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8403106Z 2025-12-04T12:42:03.8403192Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8403194Z 2025-12-04T12:42:03.8403196Z 2025-12-04T12:42:03.8403273Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8403361Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8403633Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-caa2a66b5eb1de6f.xml - 2025-12-04T12:42:03.8403694Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8404041Z FAILED [8.7134s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8404089Z Traceback (most recent call last): 2025-12-04T12:42:03.8404265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8404307Z getattr(self, test_name)() 2025-12-04T12:42:03.8404469Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8404503Z fn() 2025-12-04T12:42:03.8404655Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8404693Z method(*args, **kwargs) 2025-12-04T12:42:03.8404844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8404884Z method(*args, **kwargs) 2025-12-04T12:42:03.8405042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8405081Z with policy(): 2025-12-04T12:42:03.8405233Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8405274Z raise RuntimeError(msg) 2025-12-04T12:42:03.8405716Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1256194048 and is now 2820669440. 
2025-12-04T12:42:03.8405730Z 2025-12-04T12:42:03.8405805Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8406140Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8406142Z 2025-12-04T12:42:03.8406228Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8406231Z 2025-12-04T12:42:03.8406290Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8406335Z Traceback (most recent call last): 2025-12-04T12:42:03.8406498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8406540Z getattr(self, test_name)() 2025-12-04T12:42:03.8406701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8406735Z fn() 2025-12-04T12:42:03.8406886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8406925Z method(*args, **kwargs) 2025-12-04T12:42:03.8407076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8407115Z method(*args, **kwargs) 2025-12-04T12:42:03.8407265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8407302Z with policy(): 2025-12-04T12:42:03.8407453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8407494Z raise RuntimeError(msg) 2025-12-04T12:42:03.8407925Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 958398464 and is now 2820669440. 2025-12-04T12:42:03.8407928Z 2025-12-04T12:42:03.8408000Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8408390Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8408394Z 2025-12-04T12:42:03.8408481Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8408546Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8408609Z ======================= 1 failed, 7 deselected in 8.85s ======================== 2025-12-04T12:42:03.8408646Z Got exit code 1 2025-12-04T12:42:03.8408686Z Retrying single test... 
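The leak reported above is raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 harness, which snapshots each device's caching-allocator and driver memory before the test body runs and compares the numbers again afterwards. Below is a minimal sketch of that kind of before/after comparison using only public torch.cuda calls; the real check lives in torch/testing/_internal/common_utils.py (the `with policy():` context whose __exit__ raises the RuntimeError in the tracebacks) and differs in detail, so treat this as illustrative rather than the actual implementation.

    import torch

    def _memory_snapshot(device: int):
        # Caching-allocator view plus a driver-level view of device memory.
        # mem_get_info reports (free, total) for the whole device, so the
        # "driver allocated" number also includes memory held by other processes.
        free, total = torch.cuda.mem_get_info(device)
        return torch.cuda.memory_allocated(device), total - free

    def run_with_leak_check(device: int, test_fn):
        # Hypothetical helper, not the harness's real entry point.
        torch.cuda.synchronize(device)
        alloc_before, driver_before = _memory_snapshot(device)
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after, driver_after = _memory_snapshot(device)
        if alloc_after > alloc_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {alloc_before} and is now "
                f"reported as {alloc_after} on device {device}. CUDA driver allocated "
                f"memory was {driver_before} and is now {driver_after}."
            )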
2025-12-04T12:42:03.8408927Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-79558c292143c3e0.xml 2025-12-04T12:42:03.8408987Z ============================= test session starts ============================== 2025-12-04T12:42:03.8409100Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8409142Z cachedir: .pytest_cache 2025-12-04T12:42:03.8409300Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8409365Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8409419Z configfile: pytest.ini 2025-12-04T12:42:03.8409581Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8409939Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8409990Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8410336Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8410394Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8410449Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8410779Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8410823Z Running 1 items in this shard 2025-12-04T12:42:03.8410825Z 2025-12-04T12:42:03.8411229Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:37:42.079000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 470720 2025-12-04T12:42:03.8411386Z I1204 12:37:42.079000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 470721 2025-12-04T12:42:03.8411538Z I1204 12:37:42.080000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 470722 2025-12-04T12:42:03.8411690Z I1204 12:37:42.081000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 470723 2025-12-04T12:42:03.8412370Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8412413Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8413097Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8413141Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8413819Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8413871Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8414541Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8414593Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8415093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8415142Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8415633Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8415681Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8416170Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8416216Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8416707Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8416754Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8416889Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8417044Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8417335Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8417483Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8417761Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8417888Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8418198Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8418342Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8418625Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8418777Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8419048Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8419177Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8419447Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8419588Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8420184Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8420295Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8420486Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8420943Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8421052Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8421258Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8421417Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8421548Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8421713Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8421992Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8422142Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8422432Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8422549Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8422817Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8422967Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8423246Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8423385Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8423654Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8423783Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8424054Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8424194Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8424743Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8424852Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8425040Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8425496Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8425606Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8425811Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8425977Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8426109Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8426262Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8426542Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8426703Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8426982Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8427097Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8427376Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8427528Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8427796Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8427936Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8428240Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8428367Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8428638Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8428778Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8429328Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8429437Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8429625Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8430081Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8430187Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8430408Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8430566Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8430698Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8430849Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8431141Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8431288Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8431565Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8431704Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8431971Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8432114Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8432380Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8432522Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8432793Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8432921Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8433191Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8433330Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8433877Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8433985Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8434174Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8434638Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8434745Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8434948Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8435105Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8435145Z FAILED [8.6136s] [100%] 2025-12-04T12:42:03.8435147Z 2025-12-04T12:42:03.8435212Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8435395Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8435444Z Traceback (most recent call last): 2025-12-04T12:42:03.8435609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8435668Z self._join_processes(fn) 2025-12-04T12:42:03.8435841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8435906Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8436083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8436126Z raise RuntimeError(error) 2025-12-04T12:42:03.8436206Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8436253Z Traceback (most recent call last): 2025-12-04T12:42:03.8436412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8436455Z getattr(self, test_name)() 2025-12-04T12:42:03.8436614Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8436651Z fn() 2025-12-04T12:42:03.8436801Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8436843Z method(*args, **kwargs) 2025-12-04T12:42:03.8436992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8437032Z method(*args, **kwargs) 2025-12-04T12:42:03.8437183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8437221Z with policy(): 2025-12-04T12:42:03.8437372Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8437413Z raise RuntimeError(msg) 2025-12-04T12:42:03.8437844Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8437848Z 2025-12-04T12:42:03.8437922Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8438292Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8438295Z 2025-12-04T12:42:03.8438381Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8438383Z 2025-12-04T12:42:03.8438385Z 2025-12-04T12:42:03.8438477Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8438565Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8438835Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-79558c292143c3e0.xml - 2025-12-04T12:42:03.8438895Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8439252Z FAILED [8.6136s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8439300Z Traceback (most recent call last): 2025-12-04T12:42:03.8439463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8439506Z getattr(self, test_name)() 2025-12-04T12:42:03.8439666Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8439713Z fn() 2025-12-04T12:42:03.8439878Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8439918Z method(*args, **kwargs) 2025-12-04T12:42:03.8440067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8440106Z method(*args, **kwargs) 2025-12-04T12:42:03.8440255Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8440293Z with policy(): 2025-12-04T12:42:03.8440443Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8440484Z raise RuntimeError(msg) 2025-12-04T12:42:03.8440914Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8440918Z 2025-12-04T12:42:03.8440992Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8441328Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8441330Z 2025-12-04T12:42:03.8441416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8441479Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8441540Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8441578Z Got exit code 1 2025-12-04T12:42:03.8441618Z Retrying single test... 2025-12-04T12:42:03.8441844Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61029b26705589f5.xml 2025-12-04T12:42:03.8441901Z ============================= test session starts ============================== 2025-12-04T12:42:03.8442015Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8442057Z cachedir: .pytest_cache 2025-12-04T12:42:03.8442215Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8442260Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8442299Z configfile: pytest.ini 2025-12-04T12:42:03.8442480Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8442838Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8442889Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8443244Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8443302Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8443358Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8443685Z stepcurrent: skipping 7 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8443739Z Running 1 items in this shard 2025-12-04T12:42:03.8443751Z 2025-12-04T12:42:03.8444155Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:37:53.265000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 471122 2025-12-04T12:42:03.8444310Z I1204 12:37:53.265000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 471123 2025-12-04T12:42:03.8444463Z I1204 12:37:53.266000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 471124 2025-12-04T12:42:03.8444614Z I1204 12:37:53.267000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 471125 2025-12-04T12:42:03.8445294Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8445339Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8445837Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8445886Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8446557Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8446600Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8447282Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8447326Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8447828Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8447875Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8448578Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8448647Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8449136Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8449183Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8449676Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8449723Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8449859Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8450014Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8450297Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8450443Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8450721Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8450838Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8451108Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8451251Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8451530Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8451671Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8451939Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8452069Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8452351Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8452493Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8453046Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2820669440. 
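The RuntimeError above is raised by the leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it snapshots per-device caching-allocator and driver allocations before the test and fails if either number is still higher once the test has finished. A minimal sketch of that before/after comparison using only public torch.cuda calls (it mirrors the idea, not the internal implementation; the tolerance argument is an assumption added for illustration):

    import gc
    import torch

    def check_for_cuda_leak(fn, device=0, tol_bytes=0):
        # Snapshot caching-allocator and driver-level usage before running fn.
        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)       # caching allocator
        free_before, total = torch.cuda.mem_get_info(device)     # driver view
        driver_before = total - free_before

        fn()

        # Snapshot again after the workload; any remaining growth is suspicious.
        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if alloc_after - alloc_before > tol_bytes:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver "
                f"{driver_before} -> {driver_after} bytes"
            )

In the failures logged here the caching-allocator number grows from 0 to 7680 bytes on every rank, which is why all four processes exit with code 10.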
2025-12-04T12:42:03.8453176Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8453365Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8453820Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8453928Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8454132Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8454291Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8454421Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8454573Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8454854Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8455002Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8455279Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8455393Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8455662Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8455801Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8456078Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8462845Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8463162Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8463348Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8463625Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8463773Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8464350Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8464487Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8464681Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8465142Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8465257Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8465462Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8465624Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8465757Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8465912Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8466199Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8466349Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8466631Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8466748Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8467036Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8467179Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8467451Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8467593Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8467872Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8468004Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8468323Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8468479Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8469056Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8469169Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8469360Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8469819Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8469929Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8470133Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8470304Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8470471Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8470626Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8470909Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8471055Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8471337Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8471466Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8471736Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8471879Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8472148Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8472301Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8472574Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8472714Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8472983Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8473176Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8473754Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8473862Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8474053Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8474513Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8474620Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8474824Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8474982Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8475025Z FAILED [8.7134s] [100%] 2025-12-04T12:42:03.8475030Z 2025-12-04T12:42:03.8475091Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8475274Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8475325Z Traceback (most recent call last): 2025-12-04T12:42:03.8475491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8475537Z self._join_processes(fn) 2025-12-04T12:42:03.8475711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8475784Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8475965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8476011Z raise RuntimeError(error) 2025-12-04T12:42:03.8476096Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8476143Z Traceback (most recent call last): 2025-12-04T12:42:03.8476306Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8476349Z getattr(self, test_name)() 2025-12-04T12:42:03.8476521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8476557Z fn() 2025-12-04T12:42:03.8476710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8476752Z method(*args, **kwargs) 2025-12-04T12:42:03.8476904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8476957Z method(*args, **kwargs) 2025-12-04T12:42:03.8477126Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8477164Z with policy(): 2025-12-04T12:42:03.8477318Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8477360Z raise RuntimeError(msg) 2025-12-04T12:42:03.8477794Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2820669440. 2025-12-04T12:42:03.8477797Z 2025-12-04T12:42:03.8477875Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8478263Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8478267Z 2025-12-04T12:42:03.8478358Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8478360Z 2025-12-04T12:42:03.8478362Z 2025-12-04T12:42:03.8478442Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8478532Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8478808Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61029b26705589f5.xml - 2025-12-04T12:42:03.8478872Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8479223Z FAILED [8.7134s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8479271Z Traceback (most recent call last): 2025-12-04T12:42:03.8479439Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8479483Z getattr(self, test_name)() 2025-12-04T12:42:03.8500205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8500252Z fn() 2025-12-04T12:42:03.8500429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8500471Z method(*args, **kwargs) 2025-12-04T12:42:03.8500626Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8500666Z method(*args, **kwargs) 2025-12-04T12:42:03.8500815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8500852Z with policy(): 2025-12-04T12:42:03.8501004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8501057Z raise RuntimeError(msg) 2025-12-04T12:42:03.8501494Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2820669440. 2025-12-04T12:42:03.8501511Z 2025-12-04T12:42:03.8501588Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8501926Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8501942Z 2025-12-04T12:42:03.8502032Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8502097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8502161Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8502198Z Got exit code 1 2025-12-04T12:42:03.8502486Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8502615Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8502845Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-abf0463f5e31e225.xml 2025-12-04T12:42:03.8502904Z ============================= test session starts ============================== 2025-12-04T12:42:03.8503018Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8503059Z cachedir: .pytest_cache 2025-12-04T12:42:03.8503219Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8503265Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8503306Z configfile: pytest.ini 2025-12-04T12:42:03.8503471Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8503832Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8503885Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8504231Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8504289Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8504342Z collected 15 items / 8 deselected / 7 selected 2025-12-04T12:42:03.8504405Z stepcurrent: skipping 8 already run items. 
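The FutureWarnings emitted for every rank (and repeated in the session below) point away from FSDP.state_dict_type()/FSDP.set_state_dict_type() and toward torch.distributed.checkpoint.state_dict. A minimal sketch of that replacement, assuming an FSDP-wrapped model and its optimizer already exist; the call signatures follow the doc page linked in the warning and are not verified against this exact build:

    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    # Extract sharded model + optimizer state in one call instead of wrapping
    # the calls in FSDP.set_state_dict_type(...).
    opts = StateDictOptions(full_state_dict=False, cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=opts)

    # Load both back the same way.
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )

Per the warning text, the same helpers are meant to cover FSDP1, FSDP2 and DDP, which is what it means by supporting different parallelisms.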
2025-12-04T12:42:03.8504449Z Running 7 items in this shard 2025-12-04T12:42:03.8504451Z 2025-12-04T12:42:03.8504877Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:38:04.522000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 471524 2025-12-04T12:42:03.8505034Z I1204 12:38:04.523000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 471525 2025-12-04T12:42:03.8505197Z I1204 12:38:04.523000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 471526 2025-12-04T12:42:03.8505347Z I1204 12:38:04.524000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 471527 2025-12-04T12:42:03.8506033Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8506100Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8506771Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8506814Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8507482Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8507524Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8508236Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8508277Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8508779Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8508829Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8509341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8509388Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8509875Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8509922Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8510424Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8510470Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8511158Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8511213Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8511881Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8511923Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8512588Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8512629Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8513296Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8513339Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8513829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8513889Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8514386Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8514444Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8514940Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8514997Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8515484Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8515567Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8515805Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8515850Z local_shape = tensor.shape 2025-12-04T12:42:03.8516083Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8516126Z local_shape = tensor.shape 2025-12-04T12:42:03.8516358Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8516395Z tensor.shape, 2025-12-04T12:42:03.8516626Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8516664Z tensor.dtype, 2025-12-04T12:42:03.8516894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8516929Z tensor.shape, 2025-12-04T12:42:03.8517160Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8517195Z tensor.dtype, 2025-12-04T12:42:03.8517427Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8517468Z local_shape = tensor.shape 2025-12-04T12:42:03.8517701Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8517737Z tensor.shape, 2025-12-04T12:42:03.8517968Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8518003Z tensor.dtype, 2025-12-04T12:42:03.8518272Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8518316Z local_shape = tensor.shape 2025-12-04T12:42:03.8518559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8518598Z tensor.shape, 2025-12-04T12:42:03.8518828Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8518865Z tensor.dtype, 2025-12-04T12:42:03.8519002Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8519173Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8519457Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8519605Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8519898Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8520030Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8520303Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8520444Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8520715Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8520864Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8521162Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8521292Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8521562Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8521704Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8522271Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1260388352 and is now 2850029568. 
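The "Please use DTensor instead and we are deprecating ShardedTensor" warnings above come from the ShardedTensor branch of _state_dict_utils.py. For reference, a minimal sketch of building the DTensor equivalent directly; it assumes the default process group is already initialized and uses a 1-D mesh over all ranks (illustrative only, not what this test itself does):

    import torch
    import torch.distributed as dist
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Shard, distribute_tensor

    # Assumes dist.init_process_group(...) has already run on every rank.
    world_size = dist.get_world_size()
    mesh = init_device_mesh("cuda", (world_size,))

    # Shard dim 0 of a global tensor across the mesh; each rank keeps one slice.
    full = torch.randn(16, 8, device="cuda")
    dtensor = distribute_tensor(full, mesh, placements=[Shard(0)])

    local_shard = dtensor.to_local()      # this rank's shard
    global_view = dtensor.full_tensor()   # all-gathers the full tensor when needed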
2025-12-04T12:42:03.8522384Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8522576Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8523058Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8523168Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8523371Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8523531Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8523671Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8523825Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8524104Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8524263Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8524553Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8524669Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8524943Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8525084Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8525353Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8525492Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8525761Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8525889Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8526159Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8526301Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8526874Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8526982Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8527183Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8527654Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8527763Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8527975Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8528133Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8528298Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8528453Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8528744Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8528906Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8529187Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8529302Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8529570Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8529711Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8529980Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8530119Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8530386Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8530515Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8530784Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8530925Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8531487Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8531607Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8531796Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8532265Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8532392Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8532594Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8532752Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8532893Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8533055Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8533333Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8533481Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8533759Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8533876Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8534145Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8534285Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8534554Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8534694Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8534964Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8535091Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8535362Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8535504Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8536080Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 
2025-12-04T12:42:03.8536189Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8536378Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8536858Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8536968Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8537170Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8537338Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8537388Z FAILED [8.7151s] [ 14%] 2025-12-04T12:42:03.8537391Z 2025-12-04T12:42:03.8537449Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8537644Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8537694Z Traceback (most recent call last): 2025-12-04T12:42:03.8537856Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8537900Z self._join_processes(fn) 2025-12-04T12:42:03.8538073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8538129Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8538350Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8538395Z raise RuntimeError(error) 2025-12-04T12:42:03.8538474Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8538521Z Traceback (most recent call last): 2025-12-04T12:42:03.8538684Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8538727Z getattr(self, test_name)() 2025-12-04T12:42:03.8538889Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8538924Z fn() 2025-12-04T12:42:03.8539076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8539117Z method(*args, **kwargs) 2025-12-04T12:42:03.8539269Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8539309Z method(*args, **kwargs) 2025-12-04T12:42:03.8539459Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8539495Z with policy(): 2025-12-04T12:42:03.8539647Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8539688Z raise RuntimeError(msg) 2025-12-04T12:42:03.8540150Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8540154Z 2025-12-04T12:42:03.8540229Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8540582Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8540584Z 2025-12-04T12:42:03.8540687Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8540689Z 2025-12-04T12:42:03.8540691Z 2025-12-04T12:42:03.8540767Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8540858Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8541134Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-abf0463f5e31e225.xml - 2025-12-04T12:42:03.8541230Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8541589Z FAILED [8.7151s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8541636Z Traceback (most recent call last): 2025-12-04T12:42:03.8541800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8541844Z getattr(self, test_name)() 2025-12-04T12:42:03.8542005Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8542041Z fn() 2025-12-04T12:42:03.8542193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8542234Z method(*args, **kwargs) 2025-12-04T12:42:03.8542385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8542424Z method(*args, **kwargs) 2025-12-04T12:42:03.8542575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8542612Z with policy(): 2025-12-04T12:42:03.8542764Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8542804Z raise RuntimeError(msg) 2025-12-04T12:42:03.8543253Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T12:42:03.8543257Z 
2025-12-04T12:42:03.8543332Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8543683Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.8543685Z 
2025-12-04T12:42:03.8543773Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8543837Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.8543910Z ======================= 1 failed, 8 deselected in 8.85s ========================
2025-12-04T12:42:03.8543946Z Got exit code 1
2025-12-04T12:42:03.8543987Z Retrying single test...
2025-12-04T12:42:03.8544213Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-75b1f48c62bfaf0e.xml
2025-12-04T12:42:03.8544272Z ============================= test session starts ==============================
2025-12-04T12:42:03.8544385Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T12:42:03.8544426Z cachedir: .pytest_cache
2025-12-04T12:42:03.8544597Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T12:42:03.8544644Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T12:42:03.8544683Z configfile: pytest.ini
2025-12-04T12:42:03.8544847Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T12:42:03.8545204Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8545280Z class TestDummyModel(torch.nn.Module):
2025-12-04T12:42:03.8545629Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8545688Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T12:42:03.8545745Z collected 15 items / 14 deselected / 1 selected
2025-12-04T12:42:03.8546083Z stepcurrent: skipping 8 already run items.
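The repro instructions printed above boil down to one command plus two environment variables. A small, hypothetical wrapper (not part of the repository) that runs the same command from the base repo dir is sketched below; only the env var names, the test file and the test id are taken from the log, the rest is ordinary subprocess plumbing.

# Hypothetical helper: rerun the failing test locally with the env vars from
# the repro banner. Assumes the working directory is a pytorch checkout with
# a ROCm build of torch installed.
import os
import subprocess
import sys

TEST_FILE = "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py"
TEST_NAME = (
    "TestFSDPWithDeviceMeshAndDTensorCUDA."
    "test_dtensor_sharded_tensor_state_dict_identical_"
    "offload_to_cpu_False_is_even_sharded_model_False_cuda"
)

env = dict(os.environ)
env["PYTORCH_TEST_WITH_ROCM"] = "1"
env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
# Set to "0" instead to silence the repro banner on failure.
env["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "1"

result = subprocess.run([sys.executable, TEST_FILE, TEST_NAME], env=env)
sys.exit(result.returncode)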
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8546128Z Running 1 items in this shard 2025-12-04T12:42:03.8546131Z 2025-12-04T12:42:03.8546548Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:38:15.815000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 471926 2025-12-04T12:42:03.8546704Z I1204 12:38:15.816000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 471927 2025-12-04T12:42:03.8546855Z I1204 12:38:15.816000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 471928 2025-12-04T12:42:03.8547008Z I1204 12:38:15.817000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 471929 2025-12-04T12:42:03.8547689Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8547735Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8548467Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8548510Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8549181Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8549251Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8549918Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8549986Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8550485Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8550535Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8551028Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8551075Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8551567Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8551613Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8552101Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8552147Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8552823Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8552865Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8553541Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8553585Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8554266Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8554307Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8554798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8554878Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8555364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8555422Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8556092Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8556135Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8556623Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8556680Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8557164Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8557223Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8557460Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8557504Z local_shape = tensor.shape 2025-12-04T12:42:03.8557739Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8557775Z tensor.shape, 2025-12-04T12:42:03.8558018Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558061Z local_shape = tensor.shape 2025-12-04T12:42:03.8558334Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558370Z tensor.dtype, 2025-12-04T12:42:03.8558618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558654Z tensor.shape, 2025-12-04T12:42:03.8558885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558921Z tensor.dtype, 2025-12-04T12:42:03.8559153Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8559210Z local_shape = tensor.shape 2025-12-04T12:42:03.8559454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8559492Z tensor.shape, 2025-12-04T12:42:03.8559723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8559759Z tensor.dtype, 2025-12-04T12:42:03.8559988Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8560032Z local_shape = tensor.shape 2025-12-04T12:42:03.8560261Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8560300Z tensor.shape, 2025-12-04T12:42:03.8560529Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8560566Z tensor.dtype, 2025-12-04T12:42:03.8560701Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8560857Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8561141Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8561290Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8561570Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8561688Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8561959Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8562100Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8562380Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8562520Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8562790Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8562930Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8563201Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8563343Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8563919Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1256194048 and is now 2850029568. 
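Regarding the FSDP.set_state_dict_type FutureWarning repeated above: the replacement it recommends is the get_state_dict()/set_state_dict() pair from torch.distributed.checkpoint.state_dict. A migration sketch follows, under stated assumptions: keyword names mirror the documentation linked in the warning and may differ slightly between torch releases, and StateDictOptions(full_state_dict=False, cpu_offload=False) is only meant to approximate the sharded, offload_to_cpu=False configuration suggested by the failing test's name.

# Migration sketch for the FutureWarning above: replacing the deprecated
# FSDP.set_state_dict_type() pattern with the torch.distributed.checkpoint
# state_dict helpers the warning points to. Illustrative only.
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def save_and_restore(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Sharded (per-rank) state dicts, no CPU offload.
    options = StateDictOptions(full_state_dict=False, cpu_offload=False)

    model_sd, optim_sd = get_state_dict(model, optim, options=options)

    # ... checkpoint model_sd / optim_sd, e.g. with torch.distributed.checkpoint ...

    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )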
2025-12-04T12:42:03.8564047Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8564235Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8564706Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8564816Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8565022Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8565185Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8565315Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8565469Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8565748Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8565896Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8566173Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8566288Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8566571Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8566712Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8566980Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8567119Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8567399Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8567528Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8567798Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8567962Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8568576Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8568686Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8568877Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8569348Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8569458Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8569662Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8569821Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8569951Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8570109Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8570389Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8570538Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8570818Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8570947Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8571220Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8571363Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8571690Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8571829Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8572100Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8572241Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8572514Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8572669Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8573230Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8573338Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8573527Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8573998Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8574107Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8574309Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8574467Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8574597Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8574749Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8575028Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8575175Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8575464Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8575581Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8575853Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8575993Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8576272Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8576411Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8576691Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8576827Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8577102Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8577245Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8577806Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8577916Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8578103Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8578618Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8578726Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8578928Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8579086Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8579125Z FAILED [8.6140s] [100%] 2025-12-04T12:42:03.8579128Z 2025-12-04T12:42:03.8579185Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8579379Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8579426Z Traceback (most recent call last): 2025-12-04T12:42:03.8579603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8579647Z self._join_processes(fn) 2025-12-04T12:42:03.8579821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8579877Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8580054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8580098Z raise RuntimeError(error) 2025-12-04T12:42:03.8580177Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8580238Z Traceback (most recent call last): 2025-12-04T12:42:03.8580399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8580442Z getattr(self, test_name)() 2025-12-04T12:42:03.8580601Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8580655Z fn() 2025-12-04T12:42:03.8580807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8580862Z method(*args, **kwargs) 2025-12-04T12:42:03.8581013Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8581054Z method(*args, **kwargs) 2025-12-04T12:42:03.8581207Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8581245Z with policy(): 2025-12-04T12:42:03.8581397Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8581439Z raise RuntimeError(msg) 2025-12-04T12:42:03.8581886Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8581890Z 2025-12-04T12:42:03.8581966Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8582315Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8582317Z 2025-12-04T12:42:03.8582407Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8582409Z 2025-12-04T12:42:03.8582411Z 2025-12-04T12:42:03.8582489Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8582577Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8582849Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-75b1f48c62bfaf0e.xml - 2025-12-04T12:42:03.8582911Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8583269Z FAILED [8.6140s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8583316Z Traceback (most recent call last): 2025-12-04T12:42:03.8583480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8583534Z getattr(self, test_name)() 2025-12-04T12:42:03.8583694Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8583731Z fn() 2025-12-04T12:42:03.8583882Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8583926Z method(*args, **kwargs) 2025-12-04T12:42:03.8584076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8584116Z method(*args, **kwargs) 2025-12-04T12:42:03.8584277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8584314Z with policy(): 2025-12-04T12:42:03.8584467Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8584508Z raise RuntimeError(msg) 2025-12-04T12:42:03.8584957Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T12:42:03.8584981Z 
2025-12-04T12:42:03.8585056Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8585406Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.8585408Z 
2025-12-04T12:42:03.8585495Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8585561Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.8585623Z ======================= 1 failed, 14 deselected in 8.75s =======================
2025-12-04T12:42:03.8585662Z Got exit code 1
2025-12-04T12:42:03.8585702Z Retrying single test...
2025-12-04T12:42:03.8585931Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c5295aa9ee49c749.xml
2025-12-04T12:42:03.8585991Z ============================= test session starts ==============================
2025-12-04T12:42:03.8586103Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T12:42:03.8586146Z cachedir: .pytest_cache
2025-12-04T12:42:03.8586301Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T12:42:03.8586347Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T12:42:03.8586387Z configfile: pytest.ini
2025-12-04T12:42:03.8586554Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T12:42:03.8586911Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8586962Z class TestDummyModel(torch.nn.Module):
2025-12-04T12:42:03.8587310Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8587368Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T12:42:03.8587424Z collected 15 items / 14 deselected / 1 selected
2025-12-04T12:42:03.8587774Z stepcurrent: skipping 8 already run items.
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8587820Z Running 1 items in this shard 2025-12-04T12:42:03.8587823Z 2025-12-04T12:42:03.8588276Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:38:26.884000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 472328 2025-12-04T12:42:03.8588446Z I1204 12:38:26.885000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 472329 2025-12-04T12:42:03.8588598Z I1204 12:38:26.886000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 472330 2025-12-04T12:42:03.8588750Z I1204 12:38:26.886000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 472331 2025-12-04T12:42:03.8589446Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8589505Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8590181Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8590224Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8590894Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8590936Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8591603Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8591646Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8592144Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8592196Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8592700Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8592750Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8593248Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8593295Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8593780Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8593848Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8594522Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8594565Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8595240Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8595284Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8595950Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8595993Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8596483Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8596543Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8597031Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8597099Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8597774Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8597817Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8598356Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8598414Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8598909Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8598983Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8599223Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8599267Z local_shape = tensor.shape 2025-12-04T12:42:03.8599504Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8599546Z local_shape = tensor.shape 2025-12-04T12:42:03.8599781Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8599818Z tensor.shape, 2025-12-04T12:42:03.8600052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600091Z tensor.shape, 2025-12-04T12:42:03.8600323Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600360Z tensor.dtype, 2025-12-04T12:42:03.8600593Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600630Z tensor.dtype, 2025-12-04T12:42:03.8600865Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600907Z local_shape = tensor.shape 2025-12-04T12:42:03.8601140Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8601177Z tensor.shape, 2025-12-04T12:42:03.8601411Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8601450Z tensor.dtype, 2025-12-04T12:42:03.8601701Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8601745Z local_shape = tensor.shape 2025-12-04T12:42:03.8601975Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8602014Z tensor.shape, 2025-12-04T12:42:03.8602245Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8602281Z tensor.dtype, 2025-12-04T12:42:03.8602429Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8602587Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8602916Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8603076Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8603370Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8603487Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8603758Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8603899Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8604169Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8604309Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8604579Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8604709Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8604979Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8605121Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8605688Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2850029568. 
2025-12-04T12:42:03.8605799Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8606009Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8606478Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8606590Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8606804Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8606962Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8607093Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8607245Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8607535Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8607696Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8607975Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8608092Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8608399Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8608540Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8608810Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8608949Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8609220Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8609349Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8609622Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8609765Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8610327Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8610451Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8610642Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8611109Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8611229Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8611432Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8611591Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8611734Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8611900Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8612180Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8612327Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8612604Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8612719Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8612990Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8613130Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8613396Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8613534Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8613803Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8613931Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8614201Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8614342Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8614921Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8615032Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8615222Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8615698Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8615807Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8616008Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8616177Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8616319Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8616473Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8616756Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8616903Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8617183Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8617299Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8617571Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8617711Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8617980Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8618118Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8618432Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8618560Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8618832Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8618972Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8619549Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8619660Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8619868Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8620338Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8620459Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8620674Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8620831Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8620870Z FAILED [8.6165s] [100%] 2025-12-04T12:42:03.8620872Z 2025-12-04T12:42:03.8620932Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8621125Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8621174Z Traceback (most recent call last): 2025-12-04T12:42:03.8621335Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8621381Z self._join_processes(fn) 2025-12-04T12:42:03.8621553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8621610Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8621787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8621831Z raise RuntimeError(error) 2025-12-04T12:42:03.8621911Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8621960Z Traceback (most recent call last): 2025-12-04T12:42:03.8622158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8622203Z getattr(self, test_name)() 2025-12-04T12:42:03.8622363Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8622399Z fn() 2025-12-04T12:42:03.8622551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8622592Z method(*args, **kwargs) 2025-12-04T12:42:03.8622743Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8622784Z method(*args, **kwargs) 2025-12-04T12:42:03.8622936Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8622973Z with policy(): 2025-12-04T12:42:03.8623125Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8623181Z raise RuntimeError(msg) 2025-12-04T12:42:03.8623627Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2850029568. 2025-12-04T12:42:03.8623632Z 2025-12-04T12:42:03.8623707Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8624067Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8624070Z 2025-12-04T12:42:03.8624158Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8624163Z 2025-12-04T12:42:03.8624166Z 2025-12-04T12:42:03.8624242Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8624342Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8624626Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c5295aa9ee49c749.xml - 2025-12-04T12:42:03.8624687Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8625045Z FAILED [8.6165s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8625092Z Traceback (most recent call last): 2025-12-04T12:42:03.8625256Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8625300Z getattr(self, test_name)() 2025-12-04T12:42:03.8625459Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8625497Z fn() 2025-12-04T12:42:03.8625648Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8625689Z method(*args, **kwargs) 2025-12-04T12:42:03.8625839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8625880Z method(*args, **kwargs) 2025-12-04T12:42:03.8626029Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8626066Z with policy(): 2025-12-04T12:42:03.8626219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8626260Z raise RuntimeError(msg) 2025-12-04T12:42:03.8626704Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2850029568. 2025-12-04T12:42:03.8626708Z 2025-12-04T12:42:03.8626781Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8627132Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8627134Z 2025-12-04T12:42:03.8627231Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8627297Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8627361Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8627400Z Got exit code 1 2025-12-04T12:42:03.8627696Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8627825Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8628065Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-0d50af9fa3a75953.xml 2025-12-04T12:42:03.8628125Z ============================= test session starts ============================== 2025-12-04T12:42:03.8628278Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8628337Z cachedir: .pytest_cache 2025-12-04T12:42:03.8628496Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8628566Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8628608Z configfile: pytest.ini 2025-12-04T12:42:03.8628771Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8629132Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8629183Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8629533Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8629590Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8629646Z collected 15 items / 9 deselected / 6 selected 2025-12-04T12:42:03.8629698Z stepcurrent: skipping 9 already run items. 
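Every rank in this run also emitted the FutureWarning that FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated in favor of the torch.distributed.checkpoint.state_dict APIs linked in the warning. A hedged sketch of that migration, with placeholder model and optimizer objects, might look like this:

```python
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Gather sharded (non-full) state dicts in a way that works for FSDP1, FSDP2 and DDP,
    # which is what the deprecation warning recommends over FSDP.set_state_dict_type.
    model_sd, optim_sd = get_state_dict(
        model, optim, options=StateDictOptions(full_state_dict=False)
    )
    # ... persist model_sd / optim_sd, e.g. with torch.distributed.checkpoint.save(...) ...
    # Later, load them back into the (possibly re-wrapped) module and optimizer.
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)
```

The API doc and tutorial URLs printed in the warning walk through the same flow in more detail.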
2025-12-04T12:42:03.8629743Z Running 6 items in this shard 2025-12-04T12:42:03.8629745Z 2025-12-04T12:42:03.8630165Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:38:38.135000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 472730 2025-12-04T12:42:03.8630320Z I1204 12:38:38.136000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 472731 2025-12-04T12:42:03.8630474Z I1204 12:38:38.136000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 472732 2025-12-04T12:42:03.8630624Z I1204 12:38:38.137000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 472733 2025-12-04T12:42:03.8631309Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8631353Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8632039Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8632084Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8632765Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8632809Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8633479Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8633542Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8634042Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8634091Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8634586Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8634634Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8635125Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8635172Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8635658Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8635705Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8636388Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8636431Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8637099Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8637154Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8637824Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8637877Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8638413Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8638475Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8638958Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8639017Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8639692Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8639733Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8640221Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8640278Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8640762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8640820Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8641071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8641116Z local_shape = tensor.shape 2025-12-04T12:42:03.8641352Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8641396Z local_shape = tensor.shape 2025-12-04T12:42:03.8641628Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8641667Z tensor.shape, 2025-12-04T12:42:03.8641910Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8641949Z tensor.shape, 2025-12-04T12:42:03.8642182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8642226Z local_shape = tensor.shape 2025-12-04T12:42:03.8642470Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8642523Z tensor.dtype, 2025-12-04T12:42:03.8642753Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8642790Z tensor.dtype, 2025-12-04T12:42:03.8643021Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643056Z tensor.shape, 2025-12-04T12:42:03.8643288Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643323Z tensor.dtype, 2025-12-04T12:42:03.8643556Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643598Z local_shape = tensor.shape 2025-12-04T12:42:03.8643832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643868Z tensor.shape, 2025-12-04T12:42:03.8644099Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8644135Z tensor.dtype, 2025-12-04T12:42:03.8644271Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8644427Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8644713Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8644861Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8645140Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8645257Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8645538Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8645682Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8645950Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8646101Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8646369Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8646499Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8646796Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8646949Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8647517Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2850029568. 
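The "Please use DTensor instead and we are deprecating ShardedTensor" warnings above point at the DTensor/DeviceMesh representation that the test class name (TestFSDPWithDeviceMeshAndDTensorCUDA) already exercises. As a hedged, self-contained illustration of that representation (assuming a recent PyTorch where torch.distributed.tensor is public, and a 4-GPU job launched under torchrun so the process group can initialize):

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

def make_dtensor_shard(world_size: int = 4) -> torch.Tensor:
    # One-dimensional mesh over the visible GPUs; on ROCm the device type is still "cuda".
    mesh = init_device_mesh("cuda", (world_size,))
    full = torch.randn(16, 8, device="cuda")
    # Each rank keeps a slice along dim 0; together the ranks represent `full` as a DTensor,
    # the replacement for the ShardedTensor path the warnings deprecate.
    return distribute_tensor(full, mesh, placements=[Shard(0)])
```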
2025-12-04T12:42:03.8647627Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8647817Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8648329Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8648437Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8648641Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8648799Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8648933Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8649086Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8649369Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8649516Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8649808Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8649924Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8650192Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8650333Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8650613Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8650753Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8651021Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8651165Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8651450Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8651593Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8652158Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8652265Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8652455Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8652921Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8653029Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8653233Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8653390Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8653521Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8653672Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8653951Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8654108Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8654390Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8654507Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8654776Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8654927Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8655195Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8655334Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8655613Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8655754Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8656024Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8656166Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8656729Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8656838Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8657028Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8657493Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8657601Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8657802Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8657959Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8658089Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8658274Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8658566Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8658712Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8658989Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8659102Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8659386Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8659527Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8659794Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8659961Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8660227Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8660356Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8660626Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8660766Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8661323Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 
2025-12-04T12:42:03.8661432Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8661622Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8662089Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8662197Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8662397Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8662555Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8662598Z FAILED [11.5166s] [ 16%] 2025-12-04T12:42:03.8662600Z 2025-12-04T12:42:03.8662657Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8662858Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8662906Z Traceback (most recent call last): 2025-12-04T12:42:03.8663070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8663113Z self._join_processes(fn) 2025-12-04T12:42:03.8663287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8663341Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8663537Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8663581Z raise RuntimeError(error) 2025-12-04T12:42:03.8663662Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8663709Z Traceback (most recent call last): 2025-12-04T12:42:03.8663873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8663926Z getattr(self, test_name)() 2025-12-04T12:42:03.8664096Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8664131Z fn() 2025-12-04T12:42:03.8664287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8664328Z method(*args, **kwargs) 2025-12-04T12:42:03.8664480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8664519Z method(*args, **kwargs) 2025-12-04T12:42:03.8664672Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8664712Z with policy(): 2025-12-04T12:42:03.8664863Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8664906Z raise RuntimeError(msg) 2025-12-04T12:42:03.8665348Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8665351Z 2025-12-04T12:42:03.8665428Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8665777Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8665779Z 2025-12-04T12:42:03.8665869Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8665871Z 2025-12-04T12:42:03.8665873Z 2025-12-04T12:42:03.8665950Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8666037Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8666316Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-0d50af9fa3a75953.xml - 2025-12-04T12:42:03.8666378Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8669067Z FAILED [11.5166s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8669115Z Traceback (most recent call last): 2025-12-04T12:42:03.8669282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8669326Z getattr(self, test_name)() 2025-12-04T12:42:03.8669489Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8669523Z fn() 2025-12-04T12:42:03.8669675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8669734Z method(*args, **kwargs) 2025-12-04T12:42:03.8669886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8669925Z method(*args, **kwargs) 2025-12-04T12:42:03.8670078Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8670130Z with policy(): 2025-12-04T12:42:03.8670282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8670338Z raise RuntimeError(msg) 2025-12-04T12:42:03.8670781Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8670784Z 2025-12-04T12:42:03.8670860Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8671210Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8671213Z 2025-12-04T12:42:03.8671303Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8671368Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8671432Z ======================= 1 failed, 9 deselected in 11.65s ======================= 2025-12-04T12:42:03.8671469Z Got exit code 1 2025-12-04T12:42:03.8671509Z Retrying single test... 2025-12-04T12:42:03.8671782Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a2335104fd924581.xml 2025-12-04T12:42:03.8671841Z ============================= test session starts ============================== 2025-12-04T12:42:03.8671957Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8671998Z cachedir: .pytest_cache 2025-12-04T12:42:03.8672156Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8672203Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8672244Z configfile: pytest.ini 2025-12-04T12:42:03.8672407Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8672803Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8672855Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8673216Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8673276Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8673335Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8673673Z stepcurrent: skipping 9 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8673719Z Running 1 items in this shard 2025-12-04T12:42:03.8673721Z 2025-12-04T12:42:03.8674149Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:38:52.237000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 473132 2025-12-04T12:42:03.8674304Z I1204 12:38:52.238000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 473133 2025-12-04T12:42:03.8674466Z I1204 12:38:52.238000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 473134 2025-12-04T12:42:03.8674628Z I1204 12:38:52.239000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 473135 2025-12-04T12:42:03.8675309Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8675354Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8676028Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8676073Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8676743Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8676786Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8677452Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8677494Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8678007Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8678056Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8678569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8678618Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8679127Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8679187Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8679670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8679731Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8680408Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8680452Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8681125Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8681167Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8681834Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8681878Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8682366Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8682426Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8682931Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8682993Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8683676Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8683717Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8684204Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8684278Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8684516Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8684559Z local_shape = tensor.shape 2025-12-04T12:42:03.8684794Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8684831Z tensor.shape, 2025-12-04T12:42:03.8685065Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8685108Z local_shape = tensor.shape 2025-12-04T12:42:03.8685341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8685380Z tensor.dtype, 2025-12-04T12:42:03.8685612Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8685649Z tensor.shape, 2025-12-04T12:42:03.8685882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8685919Z tensor.dtype, 2025-12-04T12:42:03.8686405Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8686465Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8686697Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8686740Z local_shape = tensor.shape 2025-12-04T12:42:03.8686971Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687007Z tensor.shape, 2025-12-04T12:42:03.8687249Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687285Z tensor.dtype, 2025-12-04T12:42:03.8687517Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687559Z local_shape = tensor.shape 2025-12-04T12:42:03.8687790Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687837Z tensor.shape, 2025-12-04T12:42:03.8688069Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8688104Z tensor.dtype, 2025-12-04T12:42:03.8688278Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8688447Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8688746Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8688892Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8689173Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8689291Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8689559Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8689702Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8689971Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8690112Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8690380Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8690512Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8690785Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8690925Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8691503Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1107296256 and is now 2850029568. 
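The FutureWarning repeated above already names the replacement API: torch.distributed.checkpoint.state_dict.get_state_dict() and set_state_dict() instead of FSDP.set_state_dict_type(). As a hedged sketch of what that migration might look like (the model and optimizer variables are placeholders, not objects from this test):

from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def roundtrip_state(model, optimizer):
    # Gather model and optimizer state in the format the warning recommends.
    model_sd, optim_sd = get_state_dict(model, optimizer)

    # ... persist model_sd / optim_sd, e.g. with torch.distributed.checkpoint ...

    # Restore into the same (or an equivalently wrapped) model and optimizer.
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )

The doc and tutorial URLs printed in the warning cover further options, such as full vs. sharded state dicts and CPU offload.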
2025-12-04T12:42:03.8691613Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8691808Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8692288Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8692397Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8692602Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8692759Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8692902Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8693064Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8693349Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8693495Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8693777Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8693894Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8694162Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8694303Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8694572Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8694713Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8694980Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8695114Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8695383Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8695529Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8696105Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8696214Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8696404Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8696878Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8696987Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8697200Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8697372Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8697504Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8697657Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8697938Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8698084Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8698403Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8698520Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8698792Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8698933Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8699201Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8699343Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8699612Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8699745Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8700017Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8700177Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8700744Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8700853Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8701057Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8701523Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8701644Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8701861Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8702018Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8702152Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8702306Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8702589Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8702738Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8703021Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8703137Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8703408Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8703552Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8703821Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8703964Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8704233Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8704362Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8704643Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8704787Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8705363Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8705471Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8705667Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8706133Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8706261Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8706463Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8706625Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8706669Z FAILED [8.5134s] [100%] 2025-12-04T12:42:03.8706671Z 2025-12-04T12:42:03.8706730Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8706924Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8706974Z Traceback (most recent call last): 2025-12-04T12:42:03.8707141Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8707185Z self._join_processes(fn) 2025-12-04T12:42:03.8707360Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8707414Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8707597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8707642Z raise RuntimeError(error) 2025-12-04T12:42:03.8707726Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8707775Z Traceback (most recent call last): 2025-12-04T12:42:03.8707941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8707987Z getattr(self, test_name)() 2025-12-04T12:42:03.8708188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8708223Z fn() 2025-12-04T12:42:03.8708377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8708417Z method(*args, **kwargs) 2025-12-04T12:42:03.8708570Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8708610Z method(*args, **kwargs) 2025-12-04T12:42:03.8708775Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8708817Z with policy(): 2025-12-04T12:42:03.8708968Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8709015Z raise RuntimeError(msg) 2025-12-04T12:42:03.8709469Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1107296256 and is now 2850029568. 2025-12-04T12:42:03.8709472Z 2025-12-04T12:42:03.8709551Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8709900Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8709916Z 2025-12-04T12:42:03.8710009Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8710023Z 2025-12-04T12:42:03.8710025Z 2025-12-04T12:42:03.8710104Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8710191Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8710466Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a2335104fd924581.xml - 2025-12-04T12:42:03.8710528Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8710885Z FAILED [8.5134s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8710932Z Traceback (most recent call last): 2025-12-04T12:42:03.8711104Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8711147Z getattr(self, test_name)() 2025-12-04T12:42:03.8711311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8711345Z fn() 2025-12-04T12:42:03.8711498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8711540Z method(*args, **kwargs) 2025-12-04T12:42:03.8711693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8711732Z method(*args, **kwargs) 2025-12-04T12:42:03.8711883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8711921Z with policy(): 2025-12-04T12:42:03.8712075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8712115Z raise RuntimeError(msg) 2025-12-04T12:42:03.8712564Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1107296256 and is now 2850029568. 2025-12-04T12:42:03.8712567Z 2025-12-04T12:42:03.8712643Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8713002Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8713006Z 2025-12-04T12:42:03.8713095Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8713159Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8713221Z ======================= 1 failed, 14 deselected in 8.65s ======================= 2025-12-04T12:42:03.8713258Z Got exit code 1 2025-12-04T12:42:03.8713301Z Retrying single test... 2025-12-04T12:42:03.8713537Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-548e425cbf16424e.xml 2025-12-04T12:42:03.8713596Z ============================= test session starts ============================== 2025-12-04T12:42:03.8713709Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8713767Z cachedir: .pytest_cache 2025-12-04T12:42:03.8713926Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8713981Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8714023Z configfile: pytest.ini 2025-12-04T12:42:03.8714186Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8714547Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8714599Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8714948Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8715008Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8715067Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8715402Z stepcurrent: skipping 9 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8715449Z Running 1 items in this shard 2025-12-04T12:42:03.8715453Z 2025-12-04T12:42:03.8715870Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:39:03.209000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 473534 2025-12-04T12:42:03.8716025Z I1204 12:39:03.210000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 473535 2025-12-04T12:42:03.8716177Z I1204 12:39:03.211000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 473536 2025-12-04T12:42:03.8716328Z I1204 12:39:03.211000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 473537 2025-12-04T12:42:03.8717021Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8717065Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8717738Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8717783Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8718513Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8718568Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8719257Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8719299Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8719799Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8719849Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8720344Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8720391Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8720880Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8720929Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8721414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8721463Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8722152Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8722200Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8722880Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8722921Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8723632Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8723696Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8724184Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8724244Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8724911Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8724956Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8725444Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8725503Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8725990Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8726047Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8726529Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8726595Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8726832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8726879Z local_shape = tensor.shape 2025-12-04T12:42:03.8727115Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8727158Z local_shape = tensor.shape 2025-12-04T12:42:03.8727399Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8727438Z tensor.shape, 2025-12-04T12:42:03.8727670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8727711Z tensor.shape, 2025-12-04T12:42:03.8727941Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728003Z local_shape = tensor.shape 2025-12-04T12:42:03.8728262Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728301Z tensor.dtype, 2025-12-04T12:42:03.8728532Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728571Z tensor.dtype, 2025-12-04T12:42:03.8728802Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728840Z tensor.shape, 2025-12-04T12:42:03.8729074Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8729112Z tensor.dtype, 2025-12-04T12:42:03.8729345Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8729385Z local_shape = tensor.shape 2025-12-04T12:42:03.8729618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8729654Z tensor.shape, 2025-12-04T12:42:03.8729885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8729921Z tensor.dtype, 2025-12-04T12:42:03.8730057Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8730212Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8730496Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8730647Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8730927Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8731063Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8731335Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8731480Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8731768Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8731910Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8732179Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8732323Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8732606Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8732747Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8733314Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 950009856 and is now 2850029568. 
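The FutureWarning quoted above names the replacement API directly: torch.distributed.checkpoint.state_dict.get_state_dict / set_state_dict instead of FSDP.set_state_dict_type. A minimal migration sketch follows; the model and optimizer objects and the StateDictOptions values are placeholders, not taken from test_fsdp_dtensor_state_dict.py.

# Sketch of the migration the FutureWarning suggests; placeholder objects only.
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model, optimizer):
    # Per the warning text, these helpers work across FSDP1, FSDP2 and DDP.
    options = StateDictOptions(full_state_dict=False, cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)
    # ... persist model_sd / optim_sd (e.g. with torch.distributed.checkpoint) ...
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )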
2025-12-04T12:42:03.8733425Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8733617Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8734084Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8734194Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8734402Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8734561Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8734694Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8734846Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8735127Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8735272Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8735560Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8735677Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8735947Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8736099Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8736367Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8736509Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8736788Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8736927Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8737199Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8737343Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8737910Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8738019Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8738241Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8738708Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8738816Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8739021Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8739180Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8739310Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8739463Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8739761Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8739907Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8740186Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8740300Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8740584Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8740725Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8740995Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8741154Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8741434Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8741563Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8741833Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8741977Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8742540Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8742647Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8742837Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8746342Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8746464Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8746672Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8746831Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8746962Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8747143Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8747429Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8747577Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8747867Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8747985Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8748294Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8748456Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8748724Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8748880Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8749152Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8749279Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8749553Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8749694Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8750260Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8750370Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8750560Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8751033Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8751141Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8751345Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8751502Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8751568Z FAILED [8.8158s] [100%] 2025-12-04T12:42:03.8751571Z 2025-12-04T12:42:03.8751631Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8751823Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8751875Z Traceback (most recent call last): 2025-12-04T12:42:03.8752040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8752086Z self._join_processes(fn) 2025-12-04T12:42:03.8752273Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8752332Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8752510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8752556Z raise RuntimeError(error) 2025-12-04T12:42:03.8752649Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8752698Z Traceback (most recent call last): 2025-12-04T12:42:03.8752871Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8752914Z getattr(self, test_name)() 2025-12-04T12:42:03.8753072Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8753108Z fn() 2025-12-04T12:42:03.8753261Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8753305Z method(*args, **kwargs) 2025-12-04T12:42:03.8753457Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8753499Z method(*args, **kwargs) 2025-12-04T12:42:03.8753650Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8753689Z with policy(): 2025-12-04T12:42:03.8753844Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8753885Z raise RuntimeError(msg) 2025-12-04T12:42:03.8754331Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8754334Z 2025-12-04T12:42:03.8754412Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8754765Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8754769Z 2025-12-04T12:42:03.8754859Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8754861Z 2025-12-04T12:42:03.8754864Z 2025-12-04T12:42:03.8754940Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8755029Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8755303Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-548e425cbf16424e.xml - 2025-12-04T12:42:03.8755365Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8755736Z FAILED [8.8158s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8755785Z Traceback (most recent call last): 2025-12-04T12:42:03.8755953Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8755997Z getattr(self, test_name)() 2025-12-04T12:42:03.8756157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8756206Z fn() 2025-12-04T12:42:03.8756357Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8756399Z method(*args, **kwargs) 2025-12-04T12:42:03.8756551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8756601Z method(*args, **kwargs) 2025-12-04T12:42:03.8756750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8756799Z with policy(): 2025-12-04T12:42:03.8756950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8756991Z raise RuntimeError(msg) 2025-12-04T12:42:03.8757438Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8757441Z 2025-12-04T12:42:03.8757516Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8757867Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8757871Z 2025-12-04T12:42:03.8757959Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8758025Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8758086Z ======================= 1 failed, 14 deselected in 8.95s ======================= 2025-12-04T12:42:03.8758125Z Got exit code 1 2025-12-04T12:42:03.8758453Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8758583Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8758814Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d2e4c0b094a25d.xml 2025-12-04T12:42:03.8758873Z ============================= test session starts ============================== 2025-12-04T12:42:03.8758989Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8759141Z cachedir: .pytest_cache 2025-12-04T12:42:03.8759303Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8759350Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8759391Z configfile: pytest.ini 2025-12-04T12:42:03.8759556Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8759938Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8759992Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8760342Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8760415Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8760474Z collected 15 items / 10 deselected / 5 selected 2025-12-04T12:42:03.8760526Z stepcurrent: skipping 10 already run items. 
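The PytestCollectionWarning above is emitted because the helper nn.Module classes are named Test* and therefore look like test classes to pytest. One standard way to silence it is to mark the helpers as non-tests with __test__ = False (renaming the classes so they do not start with "Test" would also work). The module body below is a placeholder; only the __test__ attribute matters:

import torch

class TestDummyModel(torch.nn.Module):
    __test__ = False  # tell pytest not to collect this nn.Module helper

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(8, 8)  # placeholder layers

    def forward(self, x):
        return self.net(x)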
2025-12-04T12:42:03.8760571Z Running 5 items in this shard 2025-12-04T12:42:03.8760573Z 2025-12-04T12:42:03.8760994Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:39:14.540000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 473936 2025-12-04T12:42:03.8761179Z I1204 12:39:14.540000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 473937 2025-12-04T12:42:03.8761334Z I1204 12:39:14.541000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 473938 2025-12-04T12:42:03.8761486Z I1204 12:39:14.542000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 473939 2025-12-04T12:42:03.8762171Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8762217Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8762891Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8762934Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8763601Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8763644Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8764319Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8764370Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8764871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8764922Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8765420Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8765469Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8765956Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8766031Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8766517Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8766565Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8767242Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8767286Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8767955Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8767996Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8768716Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8768758Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8769264Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8769327Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8769999Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8770053Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8770541Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8770613Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8771109Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8771167Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8771649Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8771706Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8771943Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8771989Z local_shape = tensor.shape 2025-12-04T12:42:03.8772223Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8772262Z tensor.shape, 2025-12-04T12:42:03.8772494Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8772531Z tensor.dtype, 2025-12-04T12:42:03.8772764Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8772809Z local_shape = tensor.shape 2025-12-04T12:42:03.8773043Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8773079Z tensor.shape, 2025-12-04T12:42:03.8773310Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8773348Z tensor.dtype, 2025-12-04T12:42:03.8773578Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8773648Z local_shape = tensor.shape 2025-12-04T12:42:03.8773981Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774019Z tensor.shape, 2025-12-04T12:42:03.8774252Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774287Z tensor.dtype, 2025-12-04T12:42:03.8774530Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774572Z local_shape = tensor.shape 2025-12-04T12:42:03.8774803Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774840Z tensor.shape, 2025-12-04T12:42:03.8775070Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8775117Z tensor.dtype, 2025-12-04T12:42:03.8775266Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8775422Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8775709Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8775857Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8776137Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8776256Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8776527Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8776671Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8776939Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8777083Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8777352Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8777483Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8777757Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8777897Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8778518Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
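Several of the FutureWarnings above ask for DTensor in place of ShardedTensor. A minimal sketch of the DTensor path, assuming the public torch.distributed.tensor / torch.distributed.device_mesh modules of a recent torch; the tensor shape and names are illustrative, and the snippet has to run under a distributed launcher such as torchrun:

import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

dist.init_process_group("nccl")  # RCCL on ROCm
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
mesh = init_device_mesh("cuda", (dist.get_world_size(),))

full = torch.randn(16, 8, device="cuda")
# Shard dim 0 across the mesh; each rank materializes only its slice.
dtensor = distribute_tensor(full, mesh, placements=[Shard(0)])
print(dist.get_rank(), dtensor.to_local().shape)

dist.destroy_process_group()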
2025-12-04T12:42:03.8778630Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8778820Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8779303Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8779412Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8779631Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8779805Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8779936Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8780089Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8780369Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8780516Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8780798Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8780916Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8781186Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8781327Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8781596Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8781737Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8782005Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8782134Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8782405Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8782556Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8783117Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8783235Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8783426Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8783900Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8784034Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8784236Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8784394Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8784524Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8784677Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8784956Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8785103Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8785381Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8785496Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8785766Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8785907Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8786176Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8786317Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8786588Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8786717Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8786997Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8787139Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8787712Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1113587712 and is now 2826960896. 2025-12-04T12:42:03.8787820Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8788010Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8788520Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8788642Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8788845Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8789002Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8789134Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8789287Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8789567Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8789713Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8789991Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8790106Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8790375Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8790519Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8790787Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8790927Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8791207Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8791336Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8791607Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8791749Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8792328Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8792435Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8792638Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8793114Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8793222Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8793424Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8793583Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8793626Z FAILED [8.5146s] [ 20%] 2025-12-04T12:42:03.8793629Z 2025-12-04T12:42:03.8793687Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8793880Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8793927Z Traceback (most recent call last): 2025-12-04T12:42:03.8794094Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8794138Z self._join_processes(fn) 2025-12-04T12:42:03.8794316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8794371Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8794553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8794597Z raise RuntimeError(error) 2025-12-04T12:42:03.8794680Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8794724Z Traceback (most recent call last): 2025-12-04T12:42:03.8794887Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8794929Z getattr(self, test_name)() 2025-12-04T12:42:03.8795089Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8795124Z fn() 2025-12-04T12:42:03.8795288Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8795330Z method(*args, **kwargs) 2025-12-04T12:42:03.8795485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8795525Z method(*args, **kwargs) 2025-12-04T12:42:03.8795678Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8795716Z with policy(): 2025-12-04T12:42:03.8795870Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8795912Z raise RuntimeError(msg) 2025-12-04T12:42:03.8796365Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8796368Z 2025-12-04T12:42:03.8796455Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8796805Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8796817Z 2025-12-04T12:42:03.8796908Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8796910Z 2025-12-04T12:42:03.8796912Z 2025-12-04T12:42:03.8796990Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8797079Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8797355Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d2e4c0b094a25d.xml - 2025-12-04T12:42:03.8797417Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8797774Z FAILED [8.5146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8797821Z Traceback (most recent call last): 2025-12-04T12:42:03.8797988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8798032Z getattr(self, test_name)() 2025-12-04T12:42:03.8798231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8798266Z fn() 2025-12-04T12:42:03.8798423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8798464Z method(*args, **kwargs) 2025-12-04T12:42:03.8798617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8798658Z method(*args, **kwargs) 2025-12-04T12:42:03.8798811Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8798847Z with policy(): 2025-12-04T12:42:03.8799002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8799043Z raise RuntimeError(msg) 2025-12-04T12:42:03.8799503Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8799507Z 2025-12-04T12:42:03.8799583Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8799931Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8799933Z 2025-12-04T12:42:03.8800023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8800106Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8800171Z ======================= 1 failed, 10 deselected in 8.66s ======================= 2025-12-04T12:42:03.8800208Z Got exit code 1 2025-12-04T12:42:03.8800250Z Retrying single test... 2025-12-04T12:42:03.8800479Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8b9d0678f025b599.xml 2025-12-04T12:42:03.8800551Z ============================= test session starts ============================== 2025-12-04T12:42:03.8800677Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8800722Z cachedir: .pytest_cache 2025-12-04T12:42:03.8800883Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8800929Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8800972Z configfile: pytest.ini 2025-12-04T12:42:03.8801136Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8801498Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8801550Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8801900Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8801957Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8802015Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8802358Z stepcurrent: skipping 10 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8802404Z Running 1 items in this shard 2025-12-04T12:42:03.8802407Z 2025-12-04T12:42:03.8802823Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:39:25.702000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 474338 2025-12-04T12:42:03.8802982Z I1204 12:39:25.703000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 474339 2025-12-04T12:42:03.8803139Z I1204 12:39:25.703000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 474340 2025-12-04T12:42:03.8803290Z I1204 12:39:25.704000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 474341 2025-12-04T12:42:03.8803984Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8804030Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8804711Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8804755Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8805422Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8805485Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8806155Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8806198Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8806699Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8806748Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8807241Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8807287Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8807778Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8807827Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8808359Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8808426Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8809100Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8809144Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8809827Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8809880Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8810549Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8810605Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8811094Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8811155Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8811638Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8811697Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8812180Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8812238Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8812914Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8812956Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8813451Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8813509Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8813748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8813792Z local_shape = tensor.shape 2025-12-04T12:42:03.8814038Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8814082Z local_shape = tensor.shape 2025-12-04T12:42:03.8814315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8814353Z tensor.shape, 2025-12-04T12:42:03.8814595Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8814650Z local_shape = tensor.shape 2025-12-04T12:42:03.8814882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8814920Z tensor.dtype, 2025-12-04T12:42:03.8815152Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815189Z tensor.shape, 2025-12-04T12:42:03.8815420Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815457Z tensor.dtype, 2025-12-04T12:42:03.8815688Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815725Z tensor.shape, 2025-12-04T12:42:03.8815954Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815993Z tensor.dtype, 2025-12-04T12:42:03.8816224Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8816264Z local_shape = tensor.shape 2025-12-04T12:42:03.8816495Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8816532Z tensor.shape, 2025-12-04T12:42:03.8816762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8816798Z tensor.dtype, 2025-12-04T12:42:03.8816935Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8817090Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8817378Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8817543Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8817823Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8817942Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8818249Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8818407Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8818676Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8818816Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8819098Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8819241Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8819511Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8819652Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8820217Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1113587712 and is now 2826960896. 
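The FutureWarning repeated above points away from FSDP.set_state_dict_type and toward the torch.distributed.checkpoint state-dict helpers. A minimal sketch of that migration, assuming an already-initialized process group with an FSDP-wrapped module and its optimizer (the names model and optim below are placeholders, not taken from this test); the API doc linked in the warning is the authoritative reference:

from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

# Gather sharded (DTensor-based) state dicts for the model and optimizer.
model_sd, optim_sd = get_state_dict(model, optim)

# Restore both later from the gathered dicts.
set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)

# A full, CPU-offloaded state dict (as exercised by offload_to_cpu=True in this
# test's parametrization) can be requested through StateDictOptions.
opts = StateDictOptions(full_state_dict=True, cpu_offload=True)
full_model_sd, full_optim_sd = get_state_dict(model, optim, options=opts)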
2025-12-04T12:42:03.8820327Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8820518Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8820989Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8821100Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8821305Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8821462Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8821593Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8821744Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8822035Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8822182Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8822460Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8822574Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8822853Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8822998Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8823263Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8823421Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8823687Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8823817Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8824088Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8824263Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8824823Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8824932Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8825122Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8825588Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8825696Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8825899Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8826056Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8826188Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8826353Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8826635Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8826782Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8827068Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8827182Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8827451Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8827603Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8827880Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8828019Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8828317Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8828446Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8828714Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8828856Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8829419Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8829527Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8829715Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8830179Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8830289Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8830490Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8830662Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8830794Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8830945Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8831224Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8831383Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8831663Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8831777Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8832059Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8832221Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8832490Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8832629Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8832896Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8833024Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8833295Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8833435Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8833995Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8834104Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8834295Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8834767Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8834874Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8835084Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8835242Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8835282Z FAILED [8.7146s] [100%] 2025-12-04T12:42:03.8835286Z 2025-12-04T12:42:03.8835341Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8835534Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8835592Z Traceback (most recent call last): 2025-12-04T12:42:03.8835755Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8835798Z self._join_processes(fn) 2025-12-04T12:42:03.8835971Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8836035Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8836212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8836264Z raise RuntimeError(error) 2025-12-04T12:42:03.8836345Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8836390Z Traceback (most recent call last): 2025-12-04T12:42:03.8836553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8836595Z getattr(self, test_name)() 2025-12-04T12:42:03.8836753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8836787Z fn() 2025-12-04T12:42:03.8836941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8836983Z method(*args, **kwargs) 2025-12-04T12:42:03.8837134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8837174Z method(*args, **kwargs) 2025-12-04T12:42:03.8837324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8837360Z with policy(): 2025-12-04T12:42:03.8837513Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8837553Z raise RuntimeError(msg) 2025-12-04T12:42:03.8837994Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8837997Z 2025-12-04T12:42:03.8838072Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8838459Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8838461Z 2025-12-04T12:42:03.8838551Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8838554Z 2025-12-04T12:42:03.8838556Z 2025-12-04T12:42:03.8838630Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8838718Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8839002Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8b9d0678f025b599.xml - 2025-12-04T12:42:03.8839063Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8839423Z FAILED [8.7146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8839469Z Traceback (most recent call last): 2025-12-04T12:42:03.8839649Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8839691Z getattr(self, test_name)() 2025-12-04T12:42:03.8839852Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8839886Z fn() 2025-12-04T12:42:03.8840037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8840092Z method(*args, **kwargs) 2025-12-04T12:42:03.8840257Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8840296Z method(*args, **kwargs) 2025-12-04T12:42:03.8840446Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8840482Z with policy(): 2025-12-04T12:42:03.8840634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8840674Z raise RuntimeError(msg) 2025-12-04T12:42:03.8841118Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8841121Z 2025-12-04T12:42:03.8841198Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8841547Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8841550Z 2025-12-04T12:42:03.8841639Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8841701Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8841764Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8841801Z Got exit code 1 2025-12-04T12:42:03.8841842Z Retrying single test... 2025-12-04T12:42:03.8842069Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58cd89e09e93ed0b.xml 2025-12-04T12:42:03.8842129Z ============================= test session starts ============================== 2025-12-04T12:42:03.8842241Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8842284Z cachedir: .pytest_cache 2025-12-04T12:42:03.8842441Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8842487Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8842528Z configfile: pytest.ini 2025-12-04T12:42:03.8842692Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8843060Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8843112Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8843459Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8843515Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8843582Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8843928Z stepcurrent: skipping 10 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8843973Z Running 1 items in this shard 2025-12-04T12:42:03.8843986Z 2025-12-04T12:42:03.8844402Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:39:36.935000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 474740 2025-12-04T12:42:03.8844568Z I1204 12:39:36.936000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 474741 2025-12-04T12:42:03.8844721Z I1204 12:39:36.937000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 474742 2025-12-04T12:42:03.8844870Z I1204 12:39:36.938000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 474743 2025-12-04T12:42:03.8845552Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8845597Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8846271Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8846314Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8846985Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8847027Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8847706Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8847749Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8848283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8848345Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8848834Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8848900Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8849390Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8849450Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8849937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8849983Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8850656Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8850699Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8851366Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8851408Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8852079Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8852119Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8852621Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8852682Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8853174Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8853233Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8853714Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8853792Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8854469Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8854511Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8854998Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8855055Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8855291Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8855334Z local_shape = tensor.shape 2025-12-04T12:42:03.8855569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8855607Z tensor.shape, 2025-12-04T12:42:03.8855838Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8855875Z tensor.dtype, 2025-12-04T12:42:03.8856104Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856149Z local_shape = tensor.shape 2025-12-04T12:42:03.8856378Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856420Z local_shape = tensor.shape 2025-12-04T12:42:03.8856650Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856686Z tensor.shape, 2025-12-04T12:42:03.8856927Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856965Z tensor.shape, 2025-12-04T12:42:03.8857196Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8857233Z tensor.dtype, 2025-12-04T12:42:03.8857463Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8857499Z tensor.dtype, 2025-12-04T12:42:03.8857740Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8857783Z local_shape = tensor.shape 2025-12-04T12:42:03.8858014Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8858060Z tensor.shape, 2025-12-04T12:42:03.8858333Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8858384Z tensor.dtype, 2025-12-04T12:42:03.8858520Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8858676Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8858959Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8859107Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8859386Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8859503Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8859776Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8859917Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8860185Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8860327Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8860594Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8860724Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8860994Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8861147Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8861710Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
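The RuntimeError above comes from the CUDA memory-leak check this shard enables via mem_leak_check: the harness records caching-allocator usage before the test body and fails the test if the number has grown afterwards. A minimal sketch of that before/after comparison, using only public torch.cuda counters and a single device; this is an illustration of the mechanism, not the actual common_utils implementation.

import torch

def run_with_leak_check(test_fn, device=0):
    # snapshot caching-allocator usage before the test body runs
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)
    test_fn()
    # re-check after the test; any growth is treated as a potential leak,
    # analogous to "allocated memory was 0 and is now reported as 4608" above
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: "
            f"{before} bytes before, {after} bytes after"
        )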
2025-12-04T12:42:03.8861821Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8862024Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8862492Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8862615Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8862830Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8862988Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8863120Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8863271Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8863550Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8863696Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8863973Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8864088Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8864358Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8864501Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8864771Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8864913Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8865179Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8865307Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8865587Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8865728Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8866304Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8866412Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8866603Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8867080Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8867197Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8867400Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8867557Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8867688Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8867842Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8868124Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8868305Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8868587Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8868701Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8868969Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8869110Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8869379Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8869520Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8869800Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8869928Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8870198Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8870340Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8870915Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8871023Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8871225Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8871702Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8871810Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8872012Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8872169Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8872298Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8872451Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8872731Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8872876Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8873155Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8873270Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8873538Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8873678Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8873946Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8874095Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8874365Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8874496Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8874796Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8874946Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8875504Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8875631Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8875820Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8876291Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8876399Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8876601Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8876760Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8876799Z FAILED [8.6136s] [100%] 2025-12-04T12:42:03.8876802Z 2025-12-04T12:42:03.8876859Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8877051Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8877099Z Traceback (most recent call last): 2025-12-04T12:42:03.8877263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8877306Z self._join_processes(fn) 2025-12-04T12:42:03.8877480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8877533Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8877713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8877757Z raise RuntimeError(error) 2025-12-04T12:42:03.8877837Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8877882Z Traceback (most recent call last): 2025-12-04T12:42:03.8878045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8878087Z getattr(self, test_name)() 2025-12-04T12:42:03.8878304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8878339Z fn() 2025-12-04T12:42:03.8878492Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8878535Z method(*args, **kwargs) 2025-12-04T12:42:03.8878689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8878729Z method(*args, **kwargs) 2025-12-04T12:42:03.8878880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8878917Z with policy(): 2025-12-04T12:42:03.8879083Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8879123Z raise RuntimeError(msg) 2025-12-04T12:42:03.8879567Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8879594Z 2025-12-04T12:42:03.8879670Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8880017Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8880019Z 2025-12-04T12:42:03.8880107Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8880110Z 2025-12-04T12:42:03.8880167Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8880212Z Traceback (most recent call last): 2025-12-04T12:42:03.8880375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8880419Z getattr(self, test_name)() 2025-12-04T12:42:03.8880578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8880614Z fn() 2025-12-04T12:42:03.8880765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8880806Z method(*args, **kwargs) 2025-12-04T12:42:03.8880958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8880998Z method(*args, **kwargs) 2025-12-04T12:42:03.8881147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8881183Z with policy(): 2025-12-04T12:42:03.8881334Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8881376Z raise RuntimeError(msg) 2025-12-04T12:42:03.8881816Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8881819Z 2025-12-04T12:42:03.8881892Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8882239Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8882242Z 2025-12-04T12:42:03.8882353Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8882356Z 2025-12-04T12:42:03.8882360Z 2025-12-04T12:42:03.8882435Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8882525Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8882797Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58cd89e09e93ed0b.xml - 2025-12-04T12:42:03.8882858Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8883227Z FAILED [8.6136s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8883276Z Traceback (most recent call last): 2025-12-04T12:42:03.8883440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8883494Z getattr(self, test_name)() 2025-12-04T12:42:03.8883664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8883699Z fn() 2025-12-04T12:42:03.8883850Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8883890Z method(*args, **kwargs) 2025-12-04T12:42:03.8884040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8884080Z method(*args, **kwargs) 2025-12-04T12:42:03.8884228Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8884266Z with policy(): 2025-12-04T12:42:03.8884416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8884458Z raise RuntimeError(msg) 2025-12-04T12:42:03.8884899Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
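The "Please use DTensor instead and we are deprecating ShardedTensor" warnings earlier in this output refer to the DTensor sharding API. A minimal sketch of sharding a plain tensor over a 1-D device mesh, assuming a 4-rank process group is already initialized; the mesh size and device type mirror this job's 4-GPU runner, and the function name is illustrative.

import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard

def shard_weight(weight: torch.Tensor):
    # one mesh dimension spanning the 4 visible GPUs
    mesh = init_device_mesh("cuda", (4,))
    # shard dim 0 across the mesh instead of constructing a ShardedTensor
    return distribute_tensor(weight, mesh, [Shard(0)])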
2025-12-04T12:42:03.8884901Z 2025-12-04T12:42:03.8884975Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8885321Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8885324Z 2025-12-04T12:42:03.8885409Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8885412Z 2025-12-04T12:42:03.8885470Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8885515Z Traceback (most recent call last): 2025-12-04T12:42:03.8885679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8885720Z getattr(self, test_name)() 2025-12-04T12:42:03.8885879Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8885914Z fn() 2025-12-04T12:42:03.8886063Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8886103Z method(*args, **kwargs) 2025-12-04T12:42:03.8886262Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8886302Z method(*args, **kwargs) 2025-12-04T12:42:03.8886451Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8886492Z with policy(): 2025-12-04T12:42:03.8886641Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8886682Z raise RuntimeError(msg) 2025-12-04T12:42:03.8887128Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8887131Z 2025-12-04T12:42:03.8887207Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8887552Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8887577Z 2025-12-04T12:42:03.8887662Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8887727Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:42:03.8887789Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8887826Z Got exit code 1 2025-12-04T12:42:03.8888121Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8888284Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8888512Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ba49d39c2b2d16df.xml 2025-12-04T12:42:03.8888572Z ============================= test session starts ============================== 2025-12-04T12:42:03.8888684Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8888725Z cachedir: .pytest_cache 2025-12-04T12:42:03.8888882Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8888929Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8888969Z configfile: pytest.ini 2025-12-04T12:42:03.8889133Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8889494Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8889547Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8889894Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8889952Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8890008Z collected 15 items / 11 deselected / 4 selected 2025-12-04T12:42:03.8890061Z stepcurrent: skipping 11 already run items. 2025-12-04T12:42:03.8890106Z Running 4 items in this shard 2025-12-04T12:42:03.8890108Z 2025-12-04T12:42:03.8890545Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:39:48.074000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 475142 2025-12-04T12:42:03.8890702Z I1204 12:39:48.075000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 475143 2025-12-04T12:42:03.8890853Z I1204 12:39:48.076000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 475144 2025-12-04T12:42:03.8891016Z I1204 12:39:48.077000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 475145 2025-12-04T12:42:03.8891702Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8891769Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8892443Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8892484Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8893151Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8893195Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8893868Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8893910Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8894407Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8894458Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8894950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8894996Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8895496Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8895543Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8896040Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8896086Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8896764Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8896831Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8897499Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8897541Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8898243Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8898284Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8898774Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8898833Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8899321Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8899379Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8900065Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8900108Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8900592Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8900662Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8901200Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8901271Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8901508Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8901564Z local_shape = tensor.shape 2025-12-04T12:42:03.8901800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8901838Z tensor.shape, 2025-12-04T12:42:03.8902072Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902108Z tensor.dtype, 2025-12-04T12:42:03.8902341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902383Z local_shape = tensor.shape 2025-12-04T12:42:03.8902615Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902650Z tensor.shape, 2025-12-04T12:42:03.8902880Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902918Z tensor.dtype, 2025-12-04T12:42:03.8903148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8903192Z local_shape = tensor.shape 2025-12-04T12:42:03.8903421Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8903458Z tensor.shape, 2025-12-04T12:42:03.8903688Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8903724Z tensor.dtype, 2025-12-04T12:42:03.8903954Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8903998Z local_shape = tensor.shape 2025-12-04T12:42:03.8904230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8904278Z tensor.shape, 2025-12-04T12:42:03.8904508Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8904547Z tensor.dtype, 2025-12-04T12:42:03.8904684Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8904841Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8905139Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8905286Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8905566Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8905694Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8905975Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8906117Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8906387Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8906527Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8906796Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8906927Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8907199Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 
2705, in __exit__ 2025-12-04T12:42:03.8907342Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8907909Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8908021Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8908243Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8908724Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8908833Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8909037Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8909196Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8909326Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8909492Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8909777Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8909924Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8910216Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8910344Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8910613Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8910754Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8911023Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T12:42:03.8911161Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8911431Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8911560Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8911830Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8911972Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8912534Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8912645Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8912833Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8913310Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8913419Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8913621Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8913788Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8913917Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8914069Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8914347Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8914527Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8914805Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8914921Z 
E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8915194Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8915333Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8915602Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8915741Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8916012Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8916140Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8916410Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8916551Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8917109Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
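The "Started process N with pid ..." lines and the _join_processes/_check_return_codes frames in the failure summaries reflect the usual one-process-per-GPU test pattern: spawn a worker per rank, join them, and surface any non-zero exit code as a test failure. A rough sketch of that shape, assuming four ranks and an empty worker body; this is not the common_distributed implementation itself.

import torch.multiprocessing as mp

def _worker(rank: int, world_size: int) -> None:
    # each rank would init_process_group and run the actual test body here
    pass

def run_multiprocess_test(world_size: int = 4) -> None:
    # spawn() joins the workers and re-raises if any of them exits non-zero,
    # which the harness reports as "Process N exited with error code 10"
    mp.spawn(_worker, args=(world_size,), nprocs=world_size, join=True)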
2025-12-04T12:42:03.8917216Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8917415Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8917884Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8917992Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8918237Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8918396Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8918526Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8918690Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8919061Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8919207Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8919485Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8919600Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8919868Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8920008Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8920276Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8920414Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8920682Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8920809Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8921082Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8921222Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8921792Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8921900Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8922088Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8922564Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8922671Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8922873Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8923030Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8923080Z FAILED [8.5171s] [ 25%] 2025-12-04T12:42:03.8923094Z 2025-12-04T12:42:03.8923151Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8923342Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8923390Z Traceback (most recent call last): 2025-12-04T12:42:03.8923554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8923597Z self._join_processes(fn) 2025-12-04T12:42:03.8923769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8923824Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8924002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8924047Z raise RuntimeError(error) 2025-12-04T12:42:03.8924125Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8924172Z Traceback (most recent call last): 2025-12-04T12:42:03.8924332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8924374Z getattr(self, test_name)() 2025-12-04T12:42:03.8924535Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.8924568Z fn() 2025-12-04T12:42:03.8924722Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8924761Z method(*args, **kwargs) 2025-12-04T12:42:03.8924912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8924951Z method(*args, **kwargs) 2025-12-04T12:42:03.8925102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8925139Z with policy(): 2025-12-04T12:42:03.8925305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8925358Z raise RuntimeError(msg) 2025-12-04T12:42:03.8925817Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8925820Z 2025-12-04T12:42:03.8925897Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8926242Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8926246Z 2025-12-04T12:42:03.8926334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8926336Z 2025-12-04T12:42:03.8926338Z 2025-12-04T12:42:03.8926423Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8926512Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.8926784Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ba49d39c2b2d16df.xml - 2025-12-04T12:42:03.8926846Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8927212Z FAILED [8.5171s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8927268Z Traceback (most recent call last): 2025-12-04T12:42:03.8927433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8927476Z getattr(self, test_name)() 2025-12-04T12:42:03.8927635Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8927671Z fn() 2025-12-04T12:42:03.8927822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8927864Z method(*args, **kwargs) 2025-12-04T12:42:03.8928016Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8928056Z method(*args, **kwargs) 2025-12-04T12:42:03.8928240Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8928277Z with policy(): 2025-12-04T12:42:03.8928429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8928471Z raise RuntimeError(msg) 2025-12-04T12:42:03.8928913Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8928916Z 2025-12-04T12:42:03.8928990Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8929334Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8929336Z 2025-12-04T12:42:03.8929423Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8929486Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8929547Z ======================= 1 failed, 11 deselected in 8.66s ======================= 2025-12-04T12:42:03.8929584Z Got exit code 1 2025-12-04T12:42:03.8929623Z Retrying single test... 
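
[editor's note] The RuntimeError in the summary above is raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 harness, which snapshots the CUDA caching allocator before the test body and compares it afterwards. Below is a minimal, hedged sketch of that before/after pattern only; it is not PyTorch's actual CudaMemoryLeakCheck from common_utils.py, and `check_leak` is an illustrative name, not a real helper.

```python
import torch

def check_leak(fn, device=0):
    # Sketch of a before/after caching-allocator comparison, in the spirit of
    # the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK failures above (assumption: not the
    # real CudaMemoryLeakCheck, which also cross-checks driver-allocated memory).
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    fn()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"Caching allocator allocated memory was {before} "
            f"and is now reported as {after} on device {device}."
        )

if __name__ == "__main__":
    if torch.cuda.is_available():
        kept = []
        try:
            # Deliberately keep a tensor alive so the check fires.
            check_leak(lambda: kept.append(torch.ones(1024, device="cuda")))
        except RuntimeError as e:
            print(e)
```
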
2025-12-04T12:42:03.8929861Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1732150cd52e220b.xml 2025-12-04T12:42:03.8929921Z ============================= test session starts ============================== 2025-12-04T12:42:03.8930034Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8930075Z cachedir: .pytest_cache 2025-12-04T12:42:03.8930234Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8930281Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8930340Z configfile: pytest.ini 2025-12-04T12:42:03.8930504Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8930864Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8930930Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8931288Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8931346Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8931402Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8931740Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8931785Z Running 1 items in this shard 2025-12-04T12:42:03.8931788Z 2025-12-04T12:42:03.8932202Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:39:59.401000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 475544 2025-12-04T12:42:03.8932362Z I1204 12:39:59.402000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 475545 2025-12-04T12:42:03.8932514Z I1204 12:39:59.403000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 475546 2025-12-04T12:42:03.8932667Z I1204 12:39:59.403000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 475547 2025-12-04T12:42:03.8933347Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8933392Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8934062Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8934105Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8934785Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8934829Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8935510Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8935552Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8936057Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8936117Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8936611Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8936658Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8937147Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8937194Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8937681Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8937728Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8938434Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8938478Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8939167Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8939210Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8939889Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8939931Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8940418Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8940490Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8940986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8941046Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8941719Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8941763Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8942246Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8942303Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8942783Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8942840Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8943076Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943119Z local_shape = tensor.shape 2025-12-04T12:42:03.8943354Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943396Z local_shape = tensor.shape 2025-12-04T12:42:03.8943638Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943675Z tensor.shape, 2025-12-04T12:42:03.8943908Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943950Z local_shape = tensor.shape 2025-12-04T12:42:03.8944182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8944218Z tensor.shape, 2025-12-04T12:42:03.8944461Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8944499Z tensor.dtype, 2025-12-04T12:42:03.8944730Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8944766Z tensor.dtype, 2025-12-04T12:42:03.8945006Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8945053Z tensor.shape, 2025-12-04T12:42:03.8945283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8945319Z tensor.dtype, 2025-12-04T12:42:03.8945549Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8945591Z local_shape = tensor.shape 2025-12-04T12:42:03.8945821Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8945858Z tensor.shape, 2025-12-04T12:42:03.8946088Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8946125Z tensor.dtype, 2025-12-04T12:42:03.8946261Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8946416Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8946701Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8946850Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8947128Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8947245Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8947515Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8947656Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8947940Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8948079Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8948381Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8948512Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8948798Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8948941Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8949504Z E1204 12:40:06.754000 475547 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8949638Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8949826Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8950295Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8950404Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8950607Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8950764Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8950895Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8951047Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8951326Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8951472Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8951752Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8951866Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8952135Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8952290Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8952558Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8952697Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8952964Z E1204 12:40:06.787000 475546 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8953102Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8953376Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8953517Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8954083Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8954204Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8954392Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8954859Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8954968Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8955169Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8955327Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8955456Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8955610Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8955889Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8956036Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8956313Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8956427Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8956703Z E1204 12:40:06.804000 475544 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8956844Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8957112Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8957251Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8957527Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8957655Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8957924Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8958085Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8958686Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.8958794Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8958980Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8959446Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8959554Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8959755Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8959912Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8960041Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8960193Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8960472Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8960618Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8960896Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8961025Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8961295Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8961434Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8961717Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8961856Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8962124Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8962264Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8962551Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8962693Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8963251Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8963359Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8963548Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8964017Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8964122Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8964324Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8964481Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8964521Z FAILED [8.6144s] [100%] 2025-12-04T12:42:03.8964524Z 2025-12-04T12:42:03.8964580Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8964770Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8964818Z Traceback (most recent call last): 2025-12-04T12:42:03.8964982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8965026Z self._join_processes(fn) 2025-12-04T12:42:03.8965208Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8965263Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8965440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8965485Z raise RuntimeError(error) 2025-12-04T12:42:03.8965565Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8965611Z Traceback (most recent call last): 2025-12-04T12:42:03.8965772Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8965824Z getattr(self, test_name)() 2025-12-04T12:42:03.8965983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.8966017Z fn() 2025-12-04T12:42:03.8966170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8966222Z method(*args, **kwargs) 2025-12-04T12:42:03.8966373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8966424Z method(*args, **kwargs) 2025-12-04T12:42:03.8966575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8966612Z with policy(): 2025-12-04T12:42:03.8966766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8966806Z raise RuntimeError(msg) 2025-12-04T12:42:03.8967248Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8967251Z 2025-12-04T12:42:03.8967326Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8967672Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8967675Z 2025-12-04T12:42:03.8967762Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8967765Z 2025-12-04T12:42:03.8967768Z 2025-12-04T12:42:03.8967843Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8967932Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.8968247Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1732150cd52e220b.xml - 2025-12-04T12:42:03.8968311Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8968668Z FAILED [8.6144s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8968715Z Traceback (most recent call last): 2025-12-04T12:42:03.8968881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8968925Z getattr(self, test_name)() 2025-12-04T12:42:03.8969084Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8969135Z fn() 2025-12-04T12:42:03.8969287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8969329Z method(*args, **kwargs) 2025-12-04T12:42:03.8969479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8969520Z method(*args, **kwargs) 2025-12-04T12:42:03.8969670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8969707Z with policy(): 2025-12-04T12:42:03.8969871Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8969912Z raise RuntimeError(msg) 2025-12-04T12:42:03.8970355Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8970370Z 2025-12-04T12:42:03.8970459Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8970807Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8970810Z 2025-12-04T12:42:03.8970897Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8970961Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8971021Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8971060Z Got exit code 1 2025-12-04T12:42:03.8971100Z Retrying single test... 
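
[editor's note] The traceback above shows the parent test process failing in _join_processes/_check_return_codes once a worker exits with code 10. The snippet below is a hedged sketch of that join-and-check pattern using plain multiprocessing; it is not the PyTorch common_distributed.py implementation, and `worker`/`run_and_check` are illustrative names.

```python
import multiprocessing as mp

def worker(rank):
    # Exit code 10 is the failure code the workers in the log above report;
    # here rank 0 fails on purpose to trigger the parent-side error.
    raise SystemExit(10 if rank == 0 else 0)

def run_and_check(world_size=4):
    procs = [mp.Process(target=worker, args=(r,)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Mirror the parent-side check: raise if any worker exited nonzero.
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_and_check()
```
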
2025-12-04T12:42:03.8971325Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-05ca25794532c849.xml 2025-12-04T12:42:03.8971384Z ============================= test session starts ============================== 2025-12-04T12:42:03.8971496Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8971537Z cachedir: .pytest_cache 2025-12-04T12:42:03.8971694Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8971742Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8971782Z configfile: pytest.ini 2025-12-04T12:42:03.8971946Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8972305Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8972357Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8972701Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8972759Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8972816Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8973165Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8973209Z Running 1 items in this shard 2025-12-04T12:42:03.8973213Z 2025-12-04T12:42:03.8973628Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:40:10.680000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 475946 2025-12-04T12:42:03.8973785Z I1204 12:40:10.681000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 475947 2025-12-04T12:42:03.8973947Z I1204 12:40:10.682000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 475948 2025-12-04T12:42:03.8974097Z I1204 12:40:10.683000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 475949 2025-12-04T12:42:03.8974777Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8974843Z FSDP.set_state_dict_type(
2025-12-04T12:42:03.8975516Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T12:42:03.8975558Z FSDP.set_state_dict_type(
2025-12-04T12:42:03.8977515Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T12:42:03.8977566Z device = _get_pg_default_device(group)
2025-12-04T12:42:03.8979925Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T12:42:03.8979982Z FSDP.set_state_dict_type(
2025-12-04T12:42:03.8981894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T12:42:03.8981954Z distributed_c10d._get_pg_default_device(pg).type
2025-12-04T12:42:03.8984571Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T12:42:03.8984613Z local_shape = tensor.shape
2025-12-04T12:42:03.8984849Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T12:42:03.8984886Z tensor.shape,
2025-12-04T12:42:03.8985118Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T12:42:03.8985155Z tensor.dtype,
2025-12-04T12:42:03.8987728Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T12:42:03.8987894Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T12:42:03.8988216Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.8988365Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T12:42:03.8988658Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.8988790Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T12:42:03.8989063Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8989205Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T12:42:03.8989476Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8989616Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T12:42:03.8989888Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.8990016Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T12:42:03.8990287Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.8990429Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T12:42:03.8990990Z E1204 12:40:18.079000 475949
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T12:42:03.8991101Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8991292Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8991772Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8991882Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8992086Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8992245Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8992387Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8992541Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8992820Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8992977Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8993264Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8993380Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8993649Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8993790Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8994063Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8994202Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8994478Z E1204 12:40:18.085000 475946 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8994604Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8994875Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8995016Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8995576Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8995684Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8995884Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8996349Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8996459Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8996679Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8996838Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8996967Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8997119Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8997406Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8997564Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8997841Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8997956Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8998260Z E1204 12:40:18.109000 475947 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8998402Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8998673Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8998813Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9000854Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9000991Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9001265Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9001408Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9001972Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.9002107Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9002296Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9002761Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9002883Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9003086Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9003244Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9003389Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9003556Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9003841Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9003991Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9004270Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9004386Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9004655Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9004798Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9005068Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9005207Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9005475Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9005602Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9005873Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9006013Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9006582Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.9006691Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9006882Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9007358Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9007465Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9007667Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9007834Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9007887Z FAILED [8.6141s] [100%] 2025-12-04T12:42:03.9007889Z 2025-12-04T12:42:03.9007948Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9008140Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9008226Z Traceback (most recent call last): 2025-12-04T12:42:03.9008392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9008437Z self._join_processes(fn) 2025-12-04T12:42:03.9008610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9008666Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9008844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9008888Z raise RuntimeError(error) 2025-12-04T12:42:03.9008970Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9009016Z Traceback (most recent call last): 2025-12-04T12:42:03.9009180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9009224Z getattr(self, test_name)() 2025-12-04T12:42:03.9009385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.9009420Z fn() 2025-12-04T12:42:03.9009572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9009615Z method(*args, **kwargs) 2025-12-04T12:42:03.9009765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9009805Z method(*args, **kwargs) 2025-12-04T12:42:03.9009955Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9009992Z with policy(): 2025-12-04T12:42:03.9010144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9010185Z raise RuntimeError(msg) 2025-12-04T12:42:03.9010641Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T12:42:03.9010645Z 2025-12-04T12:42:03.9010721Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9011067Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9011069Z 2025-12-04T12:42:03.9011170Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9011172Z 2025-12-04T12:42:03.9011174Z 2025-12-04T12:42:03.9011252Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9011342Z Process 3 terminated with exit code 10, terminating remaining processes. 
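Editor's note: the RuntimeError above comes from the mem_leak_check mode of this shard, which snapshots per-device memory counters before the test body and compares them afterwards. The sketch below is illustrative only, assuming public torch.cuda APIs (which map to HIP on ROCm); the real check lives in torch.testing._internal.common_utils and is more involved. The function and variable names are made up for the example.

    # Hedged sketch (not the actual PyTorch harness): a before/after memory
    # comparison of the kind that raises "CUDA driver API confirmed a leak".
    import torch

    def run_with_leak_check(test_fn, device=0):
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
        free_before, _total = torch.cuda.mem_get_info(device)   # driver-level free bytes
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _total = torch.cuda.mem_get_info(device)
        # Flag only when both the allocator and the driver report growth.
        if alloc_after > alloc_before and free_after < free_before:
            raise RuntimeError(
                f"possible memory leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes"
            )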
2025-12-04T12:42:03.9011617Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-05ca25794532c849.xml - 2025-12-04T12:42:03.9011704Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9012061Z FAILED [8.6141s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9012108Z Traceback (most recent call last): 2025-12-04T12:42:03.9012272Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9012314Z getattr(self, test_name)() 2025-12-04T12:42:03.9012473Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9012509Z fn() 2025-12-04T12:42:03.9012659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9012701Z method(*args, **kwargs) 2025-12-04T12:42:03.9012851Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9012890Z method(*args, **kwargs) 2025-12-04T12:42:03.9013040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9013079Z with policy(): 2025-12-04T12:42:03.9013230Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9013270Z raise RuntimeError(msg) 2025-12-04T12:42:03.9013714Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T12:42:03.9013718Z 2025-12-04T12:42:03.9013792Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9014139Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9014142Z 2025-12-04T12:42:03.9014228Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9014292Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
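Editor's note: the FutureWarnings emitted throughout this test point to the replacement checkpoint APIs in torch.distributed.checkpoint.state_dict. A minimal migration sketch is below; `model` and `optimizer` are placeholders, and the specific StateDictOptions shown are assumptions for illustration, not taken from this job.

    # Hedged sketch of moving from FSDP.set_state_dict_type() to the
    # get_state_dict()/set_state_dict() APIs named in the warning.
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    def checkpoint_roundtrip(model, optimizer):
        # Sharded state dicts with CPU offload, roughly matching the older
        # StateDictType.SHARDED_STATE_DICT configuration (assumed here).
        options = StateDictOptions(full_state_dict=False, cpu_offload=True)
        model_sd, optim_sd = get_state_dict(model, optimizer, options=options)
        # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
            options=options,
        )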
2025-12-04T12:42:03.9014370Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9014408Z Got exit code 1 2025-12-04T12:42:03.9014699Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9014831Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9015059Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a4b4f2efd2b27f6d.xml 2025-12-04T12:42:03.9015129Z ============================= test session starts ============================== 2025-12-04T12:42:03.9015243Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9015283Z cachedir: .pytest_cache 2025-12-04T12:42:03.9015443Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9015500Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9015541Z configfile: pytest.ini 2025-12-04T12:42:03.9015706Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9016076Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9016127Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9016475Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9016532Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9016590Z collected 15 items / 12 deselected / 3 selected 2025-12-04T12:42:03.9016642Z stepcurrent: skipping 12 already run items. 2025-12-04T12:42:03.9016687Z Running 3 items in this shard 2025-12-04T12:42:03.9016690Z 2025-12-04T12:42:03.9017063Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 12:40:21.894000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 476348 2025-12-04T12:42:03.9017219Z I1204 12:40:21.895000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 476349 2025-12-04T12:42:03.9017371Z I1204 12:40:21.895000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 476350 2025-12-04T12:42:03.9017521Z I1204 12:40:21.896000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 476351 2025-12-04T12:42:03.9018225Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9018269Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9018956Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9019000Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9019683Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9019727Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9020392Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9020467Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9020968Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9021016Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9021507Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9021554Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9022041Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9022088Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9022575Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9022622Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9022758Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9022914Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9023196Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9023345Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9023635Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9023753Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9024024Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9024173Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9024442Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9024582Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9024862Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9025005Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9025276Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9025416Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9025938Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9026050Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9026239Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9026703Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9026813Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9027018Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9027177Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9027307Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9027461Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9027739Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9027897Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9028204Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9028321Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9028604Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9028744Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9029013Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9029164Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9029453Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9029580Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:42:03.9029851Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9029992Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9030506Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9030615Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9030803Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9031219Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9031327Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9031533Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9031690Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9031820Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9031972Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9032266Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9032413Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9032691Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9032806Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9033086Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9033229Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9033508Z E1204 12:40:29.401000 476351 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9033656Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9033924Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9034051Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9034324Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9034466Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9034978Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1101004800 and is now 2587885568. 2025-12-04T12:42:03.9035085Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9035273Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9035688Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9035795Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9035998Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9036156Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9036284Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9036447Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9036728Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9036876Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9037162Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9037277Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9037545Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9037698Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9037977Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9038115Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9038418Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9038546Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9038815Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9038956Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9039467Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9039575Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9039766Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9040183Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9040289Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9040491Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9040646Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9040687Z FAILED [8.6143s] [ 33%] 2025-12-04T12:42:03.9040703Z 2025-12-04T12:42:03.9040759Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9040905Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.9040953Z Traceback (most recent call last): 2025-12-04T12:42:03.9041115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9041160Z self._join_processes(fn) 2025-12-04T12:42:03.9041332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9041399Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9041576Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9041620Z raise RuntimeError(error) 2025-12-04T12:42:03.9041700Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.9041760Z Traceback (most recent call last): 2025-12-04T12:42:03.9041921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9041976Z getattr(self, test_name)() 2025-12-04T12:42:03.9042134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9042170Z fn() 2025-12-04T12:42:03.9042321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9042362Z method(*args, **kwargs) 2025-12-04T12:42:03.9042513Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9042553Z method(*args, **kwargs) 2025-12-04T12:42:03.9042703Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9042742Z with policy(): 2025-12-04T12:42:03.9042892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9042934Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9043327Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9043331Z 2025-12-04T12:42:03.9043405Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9043700Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9043703Z 2025-12-04T12:42:03.9043791Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9043793Z 2025-12-04T12:42:03.9043796Z 2025-12-04T12:42:03.9043872Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9043959Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9044234Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a4b4f2efd2b27f6d.xml - 2025-12-04T12:42:03.9044294Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9044617Z FAILED [8.6143s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.9044665Z Traceback (most recent call last): 2025-12-04T12:42:03.9044830Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9044874Z getattr(self, test_name)() 2025-12-04T12:42:03.9045033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9045070Z fn() 2025-12-04T12:42:03.9045220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9045273Z method(*args, **kwargs) 2025-12-04T12:42:03.9045426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9045467Z method(*args, **kwargs) 2025-12-04T12:42:03.9045617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9045673Z with policy(): 2025-12-04T12:42:03.9045824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9045876Z raise RuntimeError(msg) 2025-12-04T12:42:03.9046269Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
2025-12-04T12:42:03.9046272Z 2025-12-04T12:42:03.9046346Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9046644Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9046646Z 2025-12-04T12:42:03.9046734Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9046798Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9046860Z ======================= 1 failed, 12 deselected in 8.75s ======================= 2025-12-04T12:42:03.9046901Z Got exit code 1 2025-12-04T12:42:03.9046940Z Retrying single test... 2025-12-04T12:42:03.9047171Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f24772aeb8490283.xml 2025-12-04T12:42:03.9047229Z ============================= test session starts ============================== 2025-12-04T12:42:03.9047344Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9047384Z cachedir: .pytest_cache 2025-12-04T12:42:03.9047544Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9047591Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9047633Z configfile: pytest.ini 2025-12-04T12:42:03.9047800Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9048198Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9048249Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9048611Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9048670Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9048728Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9049019Z stepcurrent: skipping 12 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9049063Z Running 1 items in this shard 2025-12-04T12:42:03.9049066Z 2025-12-04T12:42:03.9049450Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 12:40:33.196000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 476750 2025-12-04T12:42:03.9049606Z I1204 12:40:33.197000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 476751 2025-12-04T12:42:03.9049759Z I1204 12:40:33.197000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 476752 2025-12-04T12:42:03.9049921Z I1204 12:40:33.198000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 476753 2025-12-04T12:42:03.9050618Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9050663Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9051334Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9051379Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9052046Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9052087Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9052757Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9052799Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9053300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9053362Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9053854Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9053903Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9054401Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9054451Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9054936Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9055003Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9055139Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9055294Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9055580Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9055727Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9056007Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9056124Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9056395Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9056537Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9056806Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9056947Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9057215Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9057347Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9057615Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9057770Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9058319Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 
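Regarding the FutureWarning repeated above: the test still calls FSDP.set_state_dict_type, which is being deprecated in favor of get_state_dict()/set_state_dict() from torch.distributed.checkpoint.state_dict (see the API doc linked in the warning). A hedged sketch of the suggested replacement follows; `model` and `optim` are illustrative placeholders, not objects from this test file.

from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def snapshot_and_restore(model, optim):
    # Replacement for FSDP.set_state_dict_type(...) + model.state_dict():
    # returns sharded/parallelism-aware state dicts for the model and optimizer.
    model_sd, optim_sd = get_state_dict(model, optim)
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    # Later, load them back onto a freshly constructed model/optimizer pair.
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)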
2025-12-04T12:42:03.9058428Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9058634Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9059051Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9059173Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9059390Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9059550Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9059680Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9059836Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9060117Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9060263Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9060543Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9060658Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9060928Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9061069Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9061343Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9061484Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9061752Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9061880Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9062168Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9062312Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9062847Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9062955Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9063145Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9063558Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9063687Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9063889Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9064048Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9064177Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9064329Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9064611Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9064758Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9065037Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9065151Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9065419Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9065559Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9065827Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9065965Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9066234Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9066370Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9066640Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9066782Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9067302Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9067411Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9067600Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9068024Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9068142Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9068397Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9068553Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9068682Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9068835Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9069114Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9069259Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9069538Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9069652Z E1204 12:40:40.639000 476751 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9069923Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9070063Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9070332Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9070471Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9070755Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9070884Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9071153Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9071294Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9071818Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9071925Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9072128Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9072555Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9072663Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9072865Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9073024Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9073063Z FAILED [8.5135s] [100%] 2025-12-04T12:42:03.9073065Z 2025-12-04T12:42:03.9073121Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9073265Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.9073311Z Traceback (most recent call last): 2025-12-04T12:42:03.9073474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9073518Z self._join_processes(fn) 2025-12-04T12:42:03.9073689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9073743Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9073920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9073965Z raise RuntimeError(error) 2025-12-04T12:42:03.9074044Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9074090Z Traceback (most recent call last): 2025-12-04T12:42:03.9074250Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9074293Z getattr(self, test_name)() 2025-12-04T12:42:03.9074453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9074486Z fn() 2025-12-04T12:42:03.9074638Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9074678Z method(*args, **kwargs) 2025-12-04T12:42:03.9074838Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9074878Z method(*args, **kwargs) 2025-12-04T12:42:03.9075029Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9075066Z with policy(): 2025-12-04T12:42:03.9075217Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9075257Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9075665Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 2025-12-04T12:42:03.9075667Z 2025-12-04T12:42:03.9075743Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9076041Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9076063Z 2025-12-04T12:42:03.9076152Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9076154Z 2025-12-04T12:42:03.9076156Z 2025-12-04T12:42:03.9076229Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9076318Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9076589Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f24772aeb8490283.xml - 2025-12-04T12:42:03.9076651Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9076990Z FAILED [8.5135s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9077039Z Traceback (most recent call last): 2025-12-04T12:42:03.9077204Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9077246Z getattr(self, test_name)() 2025-12-04T12:42:03.9077406Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9077441Z fn() 2025-12-04T12:42:03.9077592Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9077632Z method(*args, **kwargs) 2025-12-04T12:42:03.9077784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9077825Z method(*args, **kwargs) 2025-12-04T12:42:03.9077975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9078013Z with policy(): 2025-12-04T12:42:03.9078197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9078237Z raise RuntimeError(msg) 2025-12-04T12:42:03.9078636Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 
2025-12-04T12:42:03.9078638Z 2025-12-04T12:42:03.9078735Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9079033Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9079037Z 2025-12-04T12:42:03.9079123Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9079186Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9079247Z ======================= 1 failed, 14 deselected in 8.68s ======================= 2025-12-04T12:42:03.9079284Z Got exit code 1 2025-12-04T12:42:03.9079336Z Retrying single test... 2025-12-04T12:42:03.9079563Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f73ec1e65b79e9d8.xml 2025-12-04T12:42:03.9079622Z ============================= test session starts ============================== 2025-12-04T12:42:03.9079732Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9079789Z cachedir: .pytest_cache 2025-12-04T12:42:03.9079948Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9080006Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9080045Z configfile: pytest.ini 2025-12-04T12:42:03.9080209Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9080569Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9080620Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9080968Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9081028Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9081083Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9081373Z stepcurrent: skipping 12 already run items. 
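On the PytestCollectionWarning above: pytest attempts to collect any class whose name starts with "Test" and skips it when the class defines __init__, so TestDummyModel and TestDummyModelUneven (nn.Modules used as fixtures, not test cases) trigger a harmless warning at collection time. One way such helper classes can opt out of collection explicitly is shown below; this is a hypothetical sketch, the actual test file simply tolerates the warning.

import torch

class TestDummyModel(torch.nn.Module):
    # Tell pytest not to collect this nn.Module even though its name starts with "Test".
    __test__ = False

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.net(x)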
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9081416Z Running 1 items in this shard 2025-12-04T12:42:03.9081420Z 2025-12-04T12:42:03.9081794Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 12:40:44.420000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 477152 2025-12-04T12:42:03.9081951Z I1204 12:40:44.421000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 477153 2025-12-04T12:42:03.9082102Z I1204 12:40:44.422000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 477154 2025-12-04T12:42:03.9082254Z I1204 12:40:44.423000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 477155 2025-12-04T12:42:03.9082949Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9082993Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9083672Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9083715Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9084393Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9084445Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9085115Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9085166Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9085663Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9085712Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9086201Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9086250Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9086737Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9086784Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9087270Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9087317Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9087452Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9087606Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9087898Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9088047Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9088362Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9088491Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9088760Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9088902Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9089184Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9089336Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9089605Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9089734Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9090004Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9090144Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9090665Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
2025-12-04T12:42:03.9090773Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9090964Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9091379Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9091488Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9091692Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9091849Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9091991Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9092144Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9092424Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9092571Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9092863Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9092978Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9093247Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9093398Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9093684Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9093824Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9094091Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9094219Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9094490Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9094631Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9095143Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9095251Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9095440Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9095857Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9095964Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9096168Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9096334Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9096465Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9096617Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9096897Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9097051Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9097328Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9097442Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9097720Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9097871Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9098139Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9098317Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9098587Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9098715Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9098985Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9099126Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9099639Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 2025-12-04T12:42:03.9099745Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9099933Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9100347Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9100454Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9100668Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9100826Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9100957Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9101110Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9101400Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9101545Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9101822Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9101948Z E1204 12:40:51.980000 477154 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9102231Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9102370Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9102639Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9102779Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9103045Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9103174Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9103442Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9103583Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9104101Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9104208Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9104396Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9104809Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9104927Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9105130Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9105288Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9105327Z FAILED [8.8147s] [100%] 2025-12-04T12:42:03.9105330Z 2025-12-04T12:42:03.9105385Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9105529Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.9105585Z Traceback (most recent call last): 2025-12-04T12:42:03.9105749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9105792Z self._join_processes(fn) 2025-12-04T12:42:03.9105967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9106028Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9106206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9106260Z raise RuntimeError(error) 2025-12-04T12:42:03.9106340Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9106385Z Traceback (most recent call last): 2025-12-04T12:42:03.9106548Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9106589Z getattr(self, test_name)() 2025-12-04T12:42:03.9106751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9106784Z fn() 2025-12-04T12:42:03.9106937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9106978Z method(*args, **kwargs) 2025-12-04T12:42:03.9107131Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9107171Z method(*args, **kwargs) 2025-12-04T12:42:03.9107321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9107357Z with policy(): 2025-12-04T12:42:03.9107510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9107550Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9107945Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9107950Z 2025-12-04T12:42:03.9108026Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9108358Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9108360Z 2025-12-04T12:42:03.9108449Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9108451Z 2025-12-04T12:42:03.9108453Z 2025-12-04T12:42:03.9108528Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9108615Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9108910Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f73ec1e65b79e9d8.xml - 2025-12-04T12:42:03.9108971Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9109282Z FAILED [8.8147s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9109329Z Traceback (most recent call last): 2025-12-04T12:42:03.9109495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9109551Z getattr(self, test_name)() 2025-12-04T12:42:03.9109712Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9109746Z fn() 2025-12-04T12:42:03.9109898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9109955Z method(*args, **kwargs) 2025-12-04T12:42:03.9110106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9110156Z method(*args, **kwargs) 2025-12-04T12:42:03.9110306Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9110342Z with policy(): 2025-12-04T12:42:03.9110495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9110535Z raise RuntimeError(msg) 2025-12-04T12:42:03.9110929Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9110932Z 2025-12-04T12:42:03.9111006Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9111300Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9111304Z 2025-12-04T12:42:03.9111391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9111453Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9111515Z ======================= 1 failed, 14 deselected in 8.95s ======================= 2025-12-04T12:42:03.9111551Z Got exit code 1 2025-12-04T12:42:03.9111795Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9111923Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9112148Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-889307325d7c8e37.xml 2025-12-04T12:42:03.9112206Z ============================= test session starts ============================== 2025-12-04T12:42:03.9112319Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9112359Z cachedir: .pytest_cache 2025-12-04T12:42:03.9112522Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9112567Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9112608Z configfile: pytest.ini 2025-12-04T12:42:03.9112780Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9113142Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9113193Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9113548Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9113606Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9113662Z collected 15 items / 13 deselected / 2 selected 2025-12-04T12:42:03.9113714Z stepcurrent: skipping 13 already run items. 
2025-12-04T12:42:03.9113758Z Running 2 items in this shard 2025-12-04T12:42:03.9113760Z 2025-12-04T12:42:03.9114128Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 12:40:55.801000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 477554 2025-12-04T12:42:03.9114302Z I1204 12:40:55.802000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 477555 2025-12-04T12:42:03.9114454Z I1204 12:40:55.802000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 477556 2025-12-04T12:42:03.9114605Z I1204 12:40:55.803000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 477557 2025-12-04T12:42:03.9115281Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9115327Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9115998Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9116041Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9116707Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9116749Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9117426Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9117466Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9117962Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. 
If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9118011Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9118540Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9118589Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9119075Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9119145Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9119630Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9119676Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9119811Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9119967Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9120252Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9120399Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9120678Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9120794Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9121064Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9121205Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9121473Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9121612Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9121890Z E1204 12:41:03.224000 
477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9122021Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9122293Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9122433Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9122956Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 950009856 and is now 2587885568. 2025-12-04T12:42:03.9123073Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9123264Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9123688Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9123797Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9124001Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9124158Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9124290Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9124442Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9124726Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9124872Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9125149Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9125265Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9125534Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9125673Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9125942Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9126093Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9126360Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9126490Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9126759Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9126917Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9127428Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9127584Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9127784Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9128260Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9128370Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9128573Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9128730Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9128861Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9129013Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9129295Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9129440Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9129717Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9129832Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9130108Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9130247Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9130536Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9130676Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9130943Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9131072Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9131354Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9131495Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9132002Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9132135Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9132325Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9132737Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9132844Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9133048Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9133207Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9133336Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9133490Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9133769Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9133915Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9134192Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9134307Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9134576Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9134715Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9134992Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9135132Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9135400Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9135537Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9135809Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9135950Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9136466Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9136584Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9136772Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9137185Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9137292Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9137494Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9137650Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9137690Z FAILED [8.6153s] [ 50%] 2025-12-04T12:42:03.9137692Z 2025-12-04T12:42:03.9137748Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9137889Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9137937Z Traceback (most recent call last): 2025-12-04T12:42:03.9138100Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9138172Z self._join_processes(fn) 2025-12-04T12:42:03.9138348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9138401Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9138582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9138626Z raise 
RuntimeError(error) 2025-12-04T12:42:03.9138708Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9138753Z Traceback (most recent call last): 2025-12-04T12:42:03.9138931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9138974Z getattr(self, test_name)() 2025-12-04T12:42:03.9139133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9139168Z fn() 2025-12-04T12:42:03.9139319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9139359Z method(*args, **kwargs) 2025-12-04T12:42:03.9139511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9139551Z method(*args, **kwargs) 2025-12-04T12:42:03.9139713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9139750Z with policy(): 2025-12-04T12:42:03.9139902Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9139943Z raise RuntimeError(msg) 2025-12-04T12:42:03.9140349Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 950009856 and is now 2587885568. 2025-12-04T12:42:03.9140364Z 2025-12-04T12:42:03.9140440Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9140736Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9140738Z 2025-12-04T12:42:03.9140826Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9140828Z 2025-12-04T12:42:03.9140830Z 2025-12-04T12:42:03.9140904Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9140992Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.9141261Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-889307325d7c8e37.xml - 2025-12-04T12:42:03.9141323Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9141631Z FAILED [8.6153s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9141678Z Traceback (most recent call last): 2025-12-04T12:42:03.9141843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9141885Z getattr(self, test_name)() 2025-12-04T12:42:03.9142045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9142081Z fn() 2025-12-04T12:42:03.9142232Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9142273Z method(*args, **kwargs) 2025-12-04T12:42:03.9142423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9142462Z method(*args, **kwargs) 2025-12-04T12:42:03.9142613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9142648Z with policy(): 2025-12-04T12:42:03.9142815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9142856Z raise RuntimeError(msg) 2025-12-04T12:42:03.9143248Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 950009856 and is now 2587885568. 2025-12-04T12:42:03.9143251Z 2025-12-04T12:42:03.9143325Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9143629Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9143631Z 2025-12-04T12:42:03.9143718Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9143783Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9143843Z ======================= 1 failed, 13 deselected in 8.78s ======================= 2025-12-04T12:42:03.9143892Z Got exit code 1 2025-12-04T12:42:03.9143931Z Retrying single test... 
2025-12-04T12:42:03.9144158Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-47111fc25541c005.xml 2025-12-04T12:42:03.9144226Z ============================= test session starts ============================== 2025-12-04T12:42:03.9144339Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9144380Z cachedir: .pytest_cache 2025-12-04T12:42:03.9144538Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9144584Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9144623Z configfile: pytest.ini 2025-12-04T12:42:03.9144787Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9145144Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9145195Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9145541Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9145598Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9145652Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9145943Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9145988Z Running 1 items in this shard 2025-12-04T12:42:03.9145990Z 2025-12-04T12:42:03.9146359Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 12:41:07.191000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 477956 2025-12-04T12:42:03.9146515Z I1204 12:41:07.192000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 477957 2025-12-04T12:42:03.9146667Z I1204 12:41:07.193000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 477958 2025-12-04T12:42:03.9146817Z I1204 12:41:07.193000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 477959 2025-12-04T12:42:03.9147504Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9147550Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9148268Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9148322Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9148992Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9149047Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9149714Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9149757Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9150253Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9150301Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9150791Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9150839Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9151328Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9151374Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9151871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9151918Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9152053Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9152208Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9152500Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9152647Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9152925Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9153052Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9153334Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9153476Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9153749Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9153893Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9154161Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9154291Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9154562Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9154701Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9155217Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9155326Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9155515Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9155930Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9156057Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9156263Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9156421Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9156552Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9156703Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9156993Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9157139Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9157416Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9157559Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9157828Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9157969Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9158264Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9158407Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9158678Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9158807Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9159078Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9159217Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9159729Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9159836Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9160027Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9160456Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9160563Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9160768Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9160926Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9161055Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9161218Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9161497Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9161654Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9161930Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9162058Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9162326Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9162466Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9162733Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9162873Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9163142Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9163271Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9163540Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9163680Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9164190Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9164296Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9164485Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9164907Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9165015Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9165218Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9165374Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9165514Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9165666Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9165945Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9166111Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9166386Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9166500Z E1204 12:41:14.630000 477956 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9166769Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9166908Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9167176Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9167316Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9167582Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9167709Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9167980Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9168121Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9168674Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
2025-12-04T12:42:03.9168780Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9169004Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9169416Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9169524Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9169725Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9169894Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9169935Z FAILED [8.6147s] [100%] 2025-12-04T12:42:03.9169937Z 2025-12-04T12:42:03.9169993Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9170133Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9170193Z Traceback (most recent call last): 2025-12-04T12:42:03.9170375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9170418Z self._join_processes(fn) 2025-12-04T12:42:03.9170591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9170644Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9170822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9170865Z raise RuntimeError(error) 2025-12-04T12:42:03.9170946Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9170991Z Traceback (most recent call last): 2025-12-04T12:42:03.9171153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9171195Z getattr(self, test_name)() 2025-12-04T12:42:03.9171355Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9171388Z fn() 2025-12-04T12:42:03.9171539Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9171580Z method(*args, **kwargs) 2025-12-04T12:42:03.9171732Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9171771Z method(*args, **kwargs) 2025-12-04T12:42:03.9171922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9171959Z with policy(): 2025-12-04T12:42:03.9172112Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9172153Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9172550Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9172552Z 2025-12-04T12:42:03.9172628Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9172921Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9172934Z 2025-12-04T12:42:03.9173023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9173026Z 2025-12-04T12:42:03.9173028Z 2025-12-04T12:42:03.9173102Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9173190Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9173460Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-47111fc25541c005.xml - 2025-12-04T12:42:03.9173531Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9173839Z FAILED [8.6147s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9173887Z Traceback (most recent call last): 2025-12-04T12:42:03.9174051Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9174109Z getattr(self, test_name)() 2025-12-04T12:42:03.9174283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9174317Z fn() 2025-12-04T12:42:03.9174469Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9174508Z method(*args, **kwargs) 2025-12-04T12:42:03.9174660Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9174699Z method(*args, **kwargs) 2025-12-04T12:42:03.9174852Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9174889Z with policy(): 2025-12-04T12:42:03.9175041Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9175081Z raise RuntimeError(msg) 2025-12-04T12:42:03.9175474Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9175477Z 2025-12-04T12:42:03.9175552Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9175846Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9175849Z 2025-12-04T12:42:03.9175936Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9175999Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9176061Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9176098Z Got exit code 1 2025-12-04T12:42:03.9176138Z Retrying single test... 2025-12-04T12:42:03.9176362Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f6f712e096927ea2.xml 2025-12-04T12:42:03.9176421Z ============================= test session starts ============================== 2025-12-04T12:42:03.9176533Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9176574Z cachedir: .pytest_cache 2025-12-04T12:42:03.9176743Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9176790Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9176830Z configfile: pytest.ini 2025-12-04T12:42:03.9176994Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9177357Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9177406Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9177762Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9177820Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9177876Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9178245Z stepcurrent: skipping 13 already run items. 
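The "CUDA driver API confirmed a leak" failures above are raised by the memory-leak check that this shard enables (the job config requests mem_leak_check, and the printed repro line exports PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1): the test harness snapshots per-device caching-allocator and driver-level memory before the test and fails it if usage has grown afterwards. The sketch below is only an illustrative approximation of that before/after comparison, built from public torch.cuda calls; it is not the actual implementation in torch/testing/_internal/common_utils.py, and the exact thresholds and reporting there differ.

# Illustrative approximation only (assumption), not the real leak checker in
# torch/testing/_internal/common_utils.py: the shape of the per-device
# before/after comparison that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 performs.
import torch

def run_with_leak_check(test_fn):
    """Run test_fn, then fail if per-device GPU memory usage grew."""
    if not torch.cuda.is_available():
        return test_fn()
    devices = range(torch.cuda.device_count())
    # Snapshot caching-allocator bytes and driver-level free bytes per device.
    alloc_before = [torch.cuda.memory_allocated(d) for d in devices]
    free_before = [torch.cuda.mem_get_info(d)[0] for d in devices]
    result = test_fn()
    for d in devices:
        torch.cuda.synchronize(d)
    torch.cuda.empty_cache()
    for d in devices:
        alloc_after = torch.cuda.memory_allocated(d)
        free_after = torch.cuda.mem_get_info(d)[0]
        if alloc_after > alloc_before[d] or free_after < free_before[d]:
            raise RuntimeError(
                f"possible leak on device {d}: caching allocator went from "
                f"{alloc_before[d]} to {alloc_after} bytes"
            )
    return result

On ROCm builds the torch.cuda namespace is backed by HIP, which is why this MI300 job still reports the numbers as "CUDA driver allocated memory".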
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9178306Z Running 1 items in this shard 2025-12-04T12:42:03.9178308Z 2025-12-04T12:42:03.9178675Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 12:41:18.316000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 478358 2025-12-04T12:42:03.9178829Z I1204 12:41:18.317000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 478359 2025-12-04T12:42:03.9178980Z I1204 12:41:18.318000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 478360 2025-12-04T12:42:03.9179131Z I1204 12:41:18.318000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 478361 2025-12-04T12:42:03.9179814Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9179858Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9180527Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9180572Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9181238Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9181279Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9181959Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9182003Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9182518Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9182565Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9183055Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9183123Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9183610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9183656Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9184141Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9184189Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9184323Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9184477Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9184764Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9184913Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9185192Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9185310Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9185581Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9185720Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9186000Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9186141Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9186409Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9186537Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9186819Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9186961Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9187482Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 954204160 and is now 2587885568. 
2025-12-04T12:42:03.9187614Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9187803Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9188244Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9188353Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9188558Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9188717Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9188848Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9189000Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9189279Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9189427Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9189706Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9189821Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9190090Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9190246Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9190514Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9190655Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9190923Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9191169Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9191441Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9191580Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9192106Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9192233Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9192421Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9192834Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9192942Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9193145Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9193301Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9193432Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9193585Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9193862Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9194010Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9194287Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9194404Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9194682Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9194823Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9195091Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9195232Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9195511Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9195638Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9195909Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9196058Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9196579Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9196687Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9196876Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9197289Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9197396Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9197598Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9197754Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9197884Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9198035Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9198355Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9198502Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9198779Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9198894Z E1204 12:41:25.922000 478360 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9199177Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9199318Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9199586Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9199737Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9200005Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9200132Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9200417Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9200572Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9201083Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9201191Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9201381Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9201796Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9201902Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9202104Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9202260Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9202300Z FAILED [9.0144s] [100%] 2025-12-04T12:42:03.9202303Z 2025-12-04T12:42:03.9202358Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9202500Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9202545Z Traceback (most recent call last): 2025-12-04T12:42:03.9202709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9202752Z self._join_processes(fn) 2025-12-04T12:42:03.9202926Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9202979Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9203167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9203211Z raise RuntimeError(error) 2025-12-04T12:42:03.9203291Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9203336Z Traceback (most recent call last): 2025-12-04T12:42:03.9203498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9203542Z getattr(self, test_name)() 2025-12-04T12:42:03.9203701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9203736Z fn() 2025-12-04T12:42:03.9203898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9203939Z method(*args, **kwargs) 2025-12-04T12:42:03.9204090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9204131Z method(*args, **kwargs) 2025-12-04T12:42:03.9204290Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9204328Z with policy(): 2025-12-04T12:42:03.9204490Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9204531Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9204924Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9204926Z 2025-12-04T12:42:03.9205001Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9205293Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9205297Z 2025-12-04T12:42:03.9205385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9205388Z 2025-12-04T12:42:03.9205390Z 2025-12-04T12:42:03.9205466Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9205552Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9205825Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f6f712e096927ea2.xml - 2025-12-04T12:42:03.9205884Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9206195Z FAILED [9.0144s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9206242Z Traceback (most recent call last): 2025-12-04T12:42:03.9206405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9206449Z getattr(self, test_name)() 2025-12-04T12:42:03.9206609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9206643Z fn() 2025-12-04T12:42:03.9206795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9206835Z method(*args, **kwargs) 2025-12-04T12:42:03.9206985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9207041Z method(*args, **kwargs) 2025-12-04T12:42:03.9207190Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9207228Z with policy(): 2025-12-04T12:42:03.9207380Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9207421Z raise RuntimeError(msg) 2025-12-04T12:42:03.9207821Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9207824Z 2025-12-04T12:42:03.9207899Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9208217Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9208238Z 2025-12-04T12:42:03.9208327Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9208406Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9208469Z ======================= 1 failed, 14 deselected in 9.15s ======================= 2025-12-04T12:42:03.9208507Z Got exit code 1 2025-12-04T12:42:03.9208754Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9208882Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9209106Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-97ab67582658c2cb.xml 2025-12-04T12:42:03.9209165Z ============================= test session starts ============================== 2025-12-04T12:42:03.9209277Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9209319Z cachedir: .pytest_cache 2025-12-04T12:42:03.9209476Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9209522Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9209562Z configfile: pytest.ini 2025-12-04T12:42:03.9209725Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9210081Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9210133Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9210477Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9210536Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9210593Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9210645Z stepcurrent: skipping 14 already run items. 
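Alongside the failures, the runs above emit a FutureWarning that FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated in favor of the torch.distributed.checkpoint.state_dict APIs linked in the warning text. Below is a minimal migration sketch assuming those documented get_state_dict/set_state_dict entry points; the StateDictOptions setting shown is an illustrative assumption, not taken from this log.

# Minimal migration sketch for the FSDP.set_state_dict_type FutureWarning seen
# above, using the torch.distributed.checkpoint.state_dict APIs the warning
# links to. The options value is illustrative only.
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model, optimizer):
    # Instead of FSDP.set_state_dict_type(...) followed by model.state_dict():
    model_sd, optim_sd = get_state_dict(
        model, optimizer, options=StateDictOptions(full_state_dict=False)
    )
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    # Later, restore both in one call:
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )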
2025-12-04T12:42:03.9210688Z Running 1 items in this shard 2025-12-04T12:42:03.9210691Z 2025-12-04T12:42:03.9211040Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 12:41:29.866000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 478760 2025-12-04T12:42:03.9211196Z I1204 12:41:29.866000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 478761 2025-12-04T12:42:03.9211347Z I1204 12:41:29.867000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 478762 2025-12-04T12:42:03.9211498Z I1204 12:41:29.868000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 478763 2025-12-04T12:42:03.9212230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9212325Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9213041Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9213145Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9213852Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9213943Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9214643Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9214732Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9214866Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9215022Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9215305Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9215452Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9215733Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9215860Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9216131Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9216273Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9216552Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9216692Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9216961Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9218866Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9219161Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9219303Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9219786Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2587885568. 
2025-12-04T12:42:03.9219898Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9220108Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9220488Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9220596Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9220799Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9220957Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9221087Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9221242Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9221519Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9221666Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9221959Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9222077Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9222347Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9222489Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9222771Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9222910Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9223180Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9223391Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9223661Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9223802Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9224279Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9224388Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9224576Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9224957Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9225062Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9225267Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9225423Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9225555Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9225708Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9225987Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9226134Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9226420Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9226536Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9226806Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9226955Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9227226Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9227366Z E1204 12:41:37.370000 
478760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9227633Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9227789Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9228060Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9228233Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9228744Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 2025-12-04T12:42:03.9228854Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9229042Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9229417Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9229523Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9229726Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9229883Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9230014Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9230164Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9230444Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9230608Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9230884Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9231000Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9231267Z E1204 12:41:37.384000 478762 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9231419Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9231688Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9231827Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9232111Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9232251Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9232521Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9232660Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9233134Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
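The RuntimeError above is raised by the GPU memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it records caching-allocator and driver-level memory per device before the test body and compares both counters afterwards, and it only reports a leak that the driver also confirms, which is why each message carries two numbers per device. A minimal standalone sketch of that before/after pattern, assuming a torch build with CUDA/ROCm; this is illustrative only, not PyTorch's internal CudaMemoryLeakCheck, and all names below are made up:

    import torch

    class MemLeakCheck:
        """Illustrative context manager: snapshot per-device memory, compare on exit."""

        def __enter__(self):
            torch.cuda.synchronize()
            torch.cuda.empty_cache()
            self.before = []
            for dev in range(torch.cuda.device_count()):
                free, total = torch.cuda.mem_get_info(dev)             # driver-level view
                self.before.append((torch.cuda.memory_allocated(dev),  # caching allocator
                                    total - free))                     # driver-allocated bytes
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False                                           # never mask the test's own error
            torch.cuda.synchronize()
            for dev, (alloc_before, driver_before) in enumerate(self.before):
                alloc_after = torch.cuda.memory_allocated(dev)
                free, total = torch.cuda.mem_get_info(dev)
                driver_after = total - free
                # Flag only when both the allocator and the driver report growth,
                # mirroring the "CUDA driver API confirmed a leak" wording above.
                if alloc_after > alloc_before and driver_after > driver_before:
                    raise RuntimeError(
                        f"possible leak on device {dev}: caching allocator "
                        f"{alloc_before} -> {alloc_after}, driver "
                        f"{driver_before} -> {driver_after}")
            return False

In the failures above the caching-allocator delta is tiny (7680 bytes) while the driver-level delta is on the order of 1.3 GB, which is why both counters are printed in every message.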
2025-12-04T12:42:03.9233242Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9233430Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9233808Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9233915Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9234117Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9234275Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9234315Z FAILED [8.8128s] [100%] 2025-12-04T12:42:03.9234318Z 2025-12-04T12:42:03.9234372Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9234482Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____ 2025-12-04T12:42:03.9234528Z Traceback (most recent call last): 2025-12-04T12:42:03.9234691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9234746Z self._join_processes(fn) 2025-12-04T12:42:03.9234920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9234974Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9235154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9235198Z raise RuntimeError(error) 2025-12-04T12:42:03.9235278Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9235323Z Traceback (most recent call last): 2025-12-04T12:42:03.9235494Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9235537Z getattr(self, test_name)() 2025-12-04T12:42:03.9235697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9235732Z fn() 2025-12-04T12:42:03.9235882Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9235935Z method(*args, **kwargs) 2025-12-04T12:42:03.9236095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9236135Z method(*args, **kwargs) 2025-12-04T12:42:03.9236283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9236321Z with policy(): 2025-12-04T12:42:03.9236474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9236515Z raise RuntimeError(msg) 2025-12-04T12:42:03.9236875Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9236879Z 2025-12-04T12:42:03.9236956Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9237216Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9237218Z 2025-12-04T12:42:03.9237306Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9237309Z 2025-12-04T12:42:03.9237369Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9237414Z Traceback (most recent call last): 2025-12-04T12:42:03.9237576Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9237618Z getattr(self, test_name)() 2025-12-04T12:42:03.9237777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9237812Z fn() 2025-12-04T12:42:03.9237963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9238003Z method(*args, **kwargs) 2025-12-04T12:42:03.9238191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9238229Z method(*args, **kwargs) 2025-12-04T12:42:03.9238379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9238416Z with policy(): 2025-12-04T12:42:03.9238567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9238621Z raise RuntimeError(msg) 2025-12-04T12:42:03.9238980Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2587885568. 2025-12-04T12:42:03.9238984Z 2025-12-04T12:42:03.9239058Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9239316Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9239336Z 2025-12-04T12:42:03.9239424Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9239426Z 2025-12-04T12:42:03.9239428Z 2025-12-04T12:42:03.9239503Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9239592Z Process 1 terminated with exit code 10, terminating remaining processes. 
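Each failure block also prints a ready-to-run reproduction command and notes that PYTORCH_PRINT_REPRO_ON_FAILURE=0 suppresses the message. A small helper that wires those environment variables up when reproducing locally; the command, test path, and variable names come from the log, while the helper itself and the repo_root default are assumptions:

    import os
    import subprocess

    def reproduce(repo_root="~/pytorch", leak_check=True, print_repro=True):
        # Environment variables taken from the repro instructions in the log above.
        env = dict(
            os.environ,
            PYTORCH_TEST_WITH_ROCM="1",
            PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1" if leak_check else "0",
            PYTORCH_PRINT_REPRO_ON_FAILURE="1" if print_repro else "0",
        )
        cmd = [
            "python",
            "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
            "TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda",
        ]
        return subprocess.run(cmd, cwd=os.path.expanduser(repo_root), env=env).returncode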
2025-12-04T12:42:03.9239860Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-97ab67582658c2cb.xml - 2025-12-04T12:42:03.9239948Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9240222Z FAILED [8.8128s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9240269Z Traceback (most recent call last): 2025-12-04T12:42:03.9240431Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9240473Z getattr(self, test_name)() 2025-12-04T12:42:03.9240633Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9240667Z fn() 2025-12-04T12:42:03.9240817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9240859Z method(*args, **kwargs) 2025-12-04T12:42:03.9241009Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9241047Z method(*args, **kwargs) 2025-12-04T12:42:03.9241196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9241233Z with policy(): 2025-12-04T12:42:03.9241385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9241424Z raise RuntimeError(msg) 2025-12-04T12:42:03.9241782Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
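The parent-side traceback (wrapper -> _join_processes -> _check_return_codes) shows how the distributed test harness turns a child rank's exit code 10 into the single test failure reported here. A rough illustration of that spawn/join/check pattern; this is a sketch, not torch's actual MultiProcessTestCase:

    import multiprocessing as mp

    def _worker(rank, fn):
        try:
            fn(rank)
        except Exception:
            # A failing rank leaves with a dedicated error code, as in the log ("exit code: 10").
            raise SystemExit(10)

    def run_in_processes(fn, world_size=4):
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=_worker, args=(rank, fn)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Any nonzero child exit code becomes one parent-side RuntimeError,
        # which is what pytest ultimately records as the failure.
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")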
2025-12-04T12:42:03.9241786Z 2025-12-04T12:42:03.9241858Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9242113Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9242115Z 2025-12-04T12:42:03.9242201Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9242203Z 2025-12-04T12:42:03.9242263Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9242307Z Traceback (most recent call last): 2025-12-04T12:42:03.9242479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9242521Z getattr(self, test_name)() 2025-12-04T12:42:03.9244588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9244631Z fn() 2025-12-04T12:42:03.9244784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9244825Z method(*args, **kwargs) 2025-12-04T12:42:03.9244974Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9245038Z method(*args, **kwargs) 2025-12-04T12:42:03.9245187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9245224Z with policy(): 2025-12-04T12:42:03.9245377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9245419Z raise RuntimeError(msg) 2025-12-04T12:42:03.9245777Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2587885568. 2025-12-04T12:42:03.9245805Z 2025-12-04T12:42:03.9245880Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9246350Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9246352Z 2025-12-04T12:42:03.9246440Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9246503Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9246567Z ======================= 1 failed, 14 deselected in 8.95s ======================= 2025-12-04T12:42:03.9246603Z Got exit code 1 2025-12-04T12:42:03.9246644Z Retrying single test... 
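After the shard exits with code 1, the runner retries only the failing test: a fresh pytest session is started for the single node id, stepcurrent skips the 14 already-run items, and each attempt writes its own junit xml. A sketch of that retry loop; the function and report-path names below are illustrative, not the actual run_test.py logic:

    import subprocess
    import uuid

    def retry_single(node_id, attempts=3, report_dir="test-reports/python-pytest"):
        for _ in range(attempts):
            xml = f"{report_dir}/retry-{uuid.uuid4().hex}.xml"   # one report per attempt
            rc = subprocess.run(
                ["python", "-m", "pytest", "-v", "-x", node_id, f"--junit-xml={xml}"]
            ).returncode
            print(f"Got exit code {rc}")
            if rc == 0:
                return True
            print("Retrying single test...")
        return False

For the failure above the node id would be test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda, as printed in the "Running only ..." line.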
2025-12-04T12:42:03.9246982Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-166946d282ac9173.xml 2025-12-04T12:42:03.9247042Z ============================= test session starts ============================== 2025-12-04T12:42:03.9247157Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9247197Z cachedir: .pytest_cache 2025-12-04T12:42:03.9247358Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9247405Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9247445Z configfile: pytest.ini 2025-12-04T12:42:03.9247610Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9247973Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9248027Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9248412Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9248469Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9248525Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9248802Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9248849Z Running 1 items in this shard 2025-12-04T12:42:03.9248852Z 2025-12-04T12:42:03.9249189Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 12:41:41.095000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 479162 2025-12-04T12:42:03.9249347Z I1204 12:41:41.096000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 479163 2025-12-04T12:42:03.9249516Z I1204 12:41:41.096000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 479164 2025-12-04T12:42:03.9249667Z I1204 12:41:41.097000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 479165 2025-12-04T12:42:03.9250389Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9250515Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9251229Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9251322Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9252029Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9252120Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9252829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9252918Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9253056Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9253211Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9253507Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9253656Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9253936Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9254053Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9254334Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9254478Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9254747Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9254899Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9255177Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9255307Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9255578Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9255720Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9256205Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
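The FutureWarning emitted by every rank above points at the newer checkpoint APIs. A minimal sketch of the suggested direction, assuming model and optimizer are the FSDP-wrapped module and its optimizer; see the linked API doc for the full options, this is not a drop-in replacement for the test's set_state_dict_type call:

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    def snapshot_and_restore(model, optimizer):
        # Gather model/optimizer state through the parallelism-agnostic API the
        # warning recommends (it covers FSDP1, FSDP2, and DDP).
        model_sd, optim_sd = get_state_dict(model, optimizer)
        # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
        # Load the same structures back onto the wrapped module and optimizer.
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
        )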
2025-12-04T12:42:03.9256316Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9256508Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9256888Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9256999Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9257207Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9257364Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9257494Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9257646Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9257943Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9258089Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9258399Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9258515Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9258795Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9258937Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9259205Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9259375Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9259642Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9259770Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9260043Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9260184Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9260661Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9260769Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9260960Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9261337Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9261446Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9261648Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9261807Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9261937Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9262105Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9262386Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9262532Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9262808Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9262933Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9263203Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9263343Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9263620Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9263772Z E1204 12:41:48.505000 
479165 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9264041Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9264170Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9264441Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9264582Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9265058Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 2025-12-04T12:42:03.9265165Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9265356Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9265730Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9265840Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9266040Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9266198Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9266327Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9266490Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9266768Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9266915Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9267203Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9267316Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9267584Z E1204 12:41:48.516000 479162 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9267723Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9268011Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9268210Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9268477Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9268605Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9268874Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9269017Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9269489Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 
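The PytestCollectionWarning lines in each session header come from pytest's collection rules: a class whose name matches the Test* pattern is collected as a test class only if it has no __init__, so helper modules such as TestDummyModel are skipped and warned about. A self-contained illustration; the file name and the small model below are hypothetical:

    # test_collect_demo.py (hypothetical file name)
    import torch

    class TestDummyModel(torch.nn.Module):   # name matches pytest's Test* pattern,
        def __init__(self):                  # but the __init__ makes pytest skip collecting it
            super().__init__()
            self.net = torch.nn.Linear(8, 8)

        def forward(self, x):
            return self.net(x)

    def test_forward_shape():                # ordinary test functions still collect normally
        assert TestDummyModel()(torch.randn(2, 8)).shape == (2, 8)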
2025-12-04T12:42:03.9269596Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9269784Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9270158Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9270267Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9270469Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9270627Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9270666Z FAILED [8.6142s] [100%] 2025-12-04T12:42:03.9270682Z 2025-12-04T12:42:03.9270740Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9270847Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____ 2025-12-04T12:42:03.9270895Z Traceback (most recent call last): 2025-12-04T12:42:03.9271059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9271104Z self._join_processes(fn) 2025-12-04T12:42:03.9271278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9271344Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9271523Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9271566Z raise RuntimeError(error) 2025-12-04T12:42:03.9271646Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9271691Z Traceback (most recent call last): 2025-12-04T12:42:03.9271853Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9271926Z getattr(self, test_name)() 2025-12-04T12:42:03.9272085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9272119Z fn() 2025-12-04T12:42:03.9272271Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9272312Z method(*args, **kwargs) 2025-12-04T12:42:03.9272462Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9272501Z method(*args, **kwargs) 2025-12-04T12:42:03.9272653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9272690Z with policy(): 2025-12-04T12:42:03.9272843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9272884Z raise RuntimeError(msg) 2025-12-04T12:42:03.9273243Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9273246Z 2025-12-04T12:42:03.9273321Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9273577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9273581Z 2025-12-04T12:42:03.9273668Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9273670Z 2025-12-04T12:42:03.9273672Z 2025-12-04T12:42:03.9273747Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9273835Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9274107Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-166946d282ac9173.xml - 2025-12-04T12:42:03.9274168Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9274443Z FAILED [8.6142s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9274489Z Traceback (most recent call last): 2025-12-04T12:42:03.9274664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9274707Z getattr(self, test_name)() 2025-12-04T12:42:03.9274867Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9274902Z fn() 2025-12-04T12:42:03.9275053Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9275092Z method(*args, **kwargs) 2025-12-04T12:42:03.9275253Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9275292Z method(*args, **kwargs) 2025-12-04T12:42:03.9275441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9275479Z with policy(): 2025-12-04T12:42:03.9275631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9275683Z raise RuntimeError(msg) 2025-12-04T12:42:03.9276040Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9276054Z 2025-12-04T12:42:03.9276128Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9276384Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9276385Z 2025-12-04T12:42:03.9276471Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9276535Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9276596Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9276635Z Got exit code 1 2025-12-04T12:42:03.9276675Z Retrying single test... 2025-12-04T12:42:03.9276903Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9a57974f2962ab4b.xml 2025-12-04T12:42:03.9276960Z ============================= test session starts ============================== 2025-12-04T12:42:03.9277073Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9277114Z cachedir: .pytest_cache 2025-12-04T12:42:03.9277272Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9277319Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9277358Z configfile: pytest.ini 2025-12-04T12:42:03.9277522Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9277880Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9277931Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9278316Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9278376Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9278448Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9278699Z stepcurrent: skipping 14 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9278744Z Running 1 items in this shard 2025-12-04T12:42:03.9278748Z 2025-12-04T12:42:03.9279080Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 12:41:52.213000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 479564 2025-12-04T12:42:03.9279292Z I1204 12:41:52.214000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 479565 2025-12-04T12:42:03.9279443Z I1204 12:41:52.214000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 479566 2025-12-04T12:42:03.9279594Z I1204 12:41:52.215000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 479567 2025-12-04T12:42:03.9280314Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9280434Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9281149Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9281238Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9281942Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9282031Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9282735Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9282825Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9282959Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9283115Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9283406Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9283554Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9283833Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9283948Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9284228Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9284370Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9284637Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9284798Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9285068Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9285197Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9285467Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9285607Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9286084Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1254096896 and is now 2587885568. 
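Each pytest session header above also reports the hypothesis profile in use ('pytorch_ci': database=None, max_examples=50, derandomize=True, too_slow suppressed). A sketch of how such a profile is registered and selected; the parameter values mirror the log, while the registration site (a conftest.py) is an assumption:

    from hypothesis import HealthCheck, settings

    # Register a CI profile matching the parameters printed in the session header.
    settings.register_profile(
        "pytorch_ci",
        database=None,
        max_examples=50,
        derandomize=True,
        suppress_health_check=[HealthCheck.too_slow],
    )
    settings.load_profile("pytorch_ci")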
2025-12-04T12:42:03.9286194Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9286382Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9286759Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9286869Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9287072Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9287229Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9287358Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9287520Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9287798Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9287946Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9288442Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9288579Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9288849Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9288988Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9289276Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9289430Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9289697Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9289824Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9290093Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9290235Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9290711Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9290820Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9291008Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9291383Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9291490Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9291692Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9291850Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9291978Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9292146Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9292423Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9292572Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9292867Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9292983Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9293252Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9293393Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9293690Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9293829Z E1204 12:41:59.677000 
479565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9294098Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9294226Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9294497Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9294639Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9295114Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9295221Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9295409Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9295785Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9295891Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9296095Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9296251Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9296392Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9296545Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9296826Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9296972Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9297259Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9297374Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9297641Z E1204 12:41:59.704000 479564 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9297802Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9298070Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9298248Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9298517Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9298644Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9298915Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9299056Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9299531Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 
2025-12-04T12:42:03.9299643Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9299831Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9300209Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9300314Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9300518Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9300687Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9300729Z FAILED [8.6145s] [100%] 2025-12-04T12:42:03.9300731Z 2025-12-04T12:42:03.9300787Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9300898Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____ 2025-12-04T12:42:03.9300945Z Traceback (most recent call last): 2025-12-04T12:42:03.9301109Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9301152Z self._join_processes(fn) 2025-12-04T12:42:03.9301340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9301393Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9301574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9301618Z raise RuntimeError(error) 2025-12-04T12:42:03.9301698Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.9301757Z Traceback (most recent call last): 2025-12-04T12:42:03.9301918Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9301975Z getattr(self, test_name)() 2025-12-04T12:42:03.9302134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9302170Z fn() 2025-12-04T12:42:03.9302322Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9302363Z method(*args, **kwargs) 2025-12-04T12:42:03.9302514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9302555Z method(*args, **kwargs) 2025-12-04T12:42:03.9302705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9302744Z with policy(): 2025-12-04T12:42:03.9302898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9302939Z raise RuntimeError(msg) 2025-12-04T12:42:03.9303294Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9303297Z 2025-12-04T12:42:03.9303373Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9303629Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9303633Z 2025-12-04T12:42:03.9303721Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9303724Z 2025-12-04T12:42:03.9303727Z 2025-12-04T12:42:03.9303802Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9303890Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9304162Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9a57974f2962ab4b.xml - 2025-12-04T12:42:03.9304221Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9304507Z FAILED [8.6145s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.9304555Z Traceback (most recent call last): 2025-12-04T12:42:03.9304718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9304763Z getattr(self, test_name)() 2025-12-04T12:42:03.9304922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9304956Z fn() 2025-12-04T12:42:03.9305107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9305165Z method(*args, **kwargs) 2025-12-04T12:42:03.9305316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9305356Z method(*args, **kwargs) 2025-12-04T12:42:03.9305504Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9305542Z with policy(): 2025-12-04T12:42:03.9305693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9305756Z raise RuntimeError(msg) 2025-12-04T12:42:03.9306113Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
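Each failing entry above follows the same pattern: every rank's child process logs "exiting process N with exit code: 10", and the parent then raises "Process N exited with error code 10 and exception: ..." from _check_return_codes. The snippet below is a simplified stand-in for that spawn-and-join flow, not the actual MultiProcessTestCase implementation; the rank body is a placeholder and only the exit code 10 is taken from the log above.

import multiprocessing as mp

def _rank_main(rank, world_size):
    # Placeholder for the per-rank test body. A failing rank exits with a
    # nonzero status, mirroring "exiting process N with exit code: 10" above.
    raise SystemExit(10)

def run_multiprocess_test(world_size=4):
    procs = [mp.Process(target=_rank_main, args=(r, world_size)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # The parent surfaces a bad exit code, which is what shows up as
            # "RuntimeError: Process N exited with error code 10" in the summary.
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")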
2025-12-04T12:42:03.9306116Z 2025-12-04T12:42:03.9306190Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9306446Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9306449Z 2025-12-04T12:42:03.9306534Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9306597Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9306657Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9306697Z Got exit code 1 2025-12-04T12:42:03.9306903Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9307031Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9307257Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c23a818ffc04ad44.xml 2025-12-04T12:42:03.9307315Z ============================= test session starts ============================== 2025-12-04T12:42:03.9307427Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9307469Z cachedir: .pytest_cache 2025-12-04T12:42:03.9307626Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9307673Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9307712Z configfile: pytest.ini 2025-12-04T12:42:03.9307876Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9308274Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9308324Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9308683Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9308741Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9308799Z collected 15 items / 15 deselected / 0 selected 2025-12-04T12:42:03.9308852Z stepcurrent: skipping 15 already run items. 
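The "To execute this test, run the following from the base repo dir" lines above already give a one-line shell repro. Below is a hedged sketch of driving the same command from Python; the checkout path is a placeholder, and only the two environment variables shown in the log are added (setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 would suppress the repro banner, as the log notes).

import os
import subprocess

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
        "TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda",
    ],
    cwd="/path/to/pytorch",  # placeholder for the base repo dir mentioned in the log
    env=env,
    check=True,
)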
2025-12-04T12:42:03.9308896Z Running 0 items in this shard 2025-12-04T12:42:03.9308898Z 2025-12-04T12:42:03.9309181Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c23a818ffc04ad44.xml - 2025-12-04T12:42:03.9309241Z ============================ 15 deselected in 0.01s ============================ 2025-12-04T12:42:03.9313004Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda'] 2025-12-04T12:42:03.9313039Z 2025-12-04T12:42:03.9313257Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_fsdp_dtensor_state_dict_1.1_429921b2f227c24a_.log) 2025-12-04T12:42:03.9313260Z 
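Every failure listed above is the same mem-leak-check assertion: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the harness snapshots per-device memory counters before the test body runs and compares them afterwards, and any growth in the caching-allocator number is reported together with the driver-level number. The sketch below illustrates that before/after comparison using the public torch.cuda counters; it is an illustration only, not the exact logic of the leak-check policy referenced in the tracebacks.

import torch

def _snapshot(device):
    # Bytes currently held by the caching allocator on this device.
    allocated = torch.cuda.memory_allocated(device)
    # Driver-level usage: total minus free, as reported by cudaMemGetInfo/hipMemGetInfo.
    free, total = torch.cuda.mem_get_info(device)
    return allocated, total - free

def check_for_leak(run_test, device=0):
    torch.cuda.synchronize(device)
    alloc_before, driver_before = _snapshot(device)
    run_test()
    torch.cuda.synchronize(device)
    alloc_after, driver_after = _snapshot(device)
    if alloc_after > alloc_before:
        # Mirrors the wording of the RuntimeError seen throughout this log.
        raise RuntimeError(
            f"Caching allocator allocated memory was {alloc_before} and is now "
            f"reported as {alloc_after} on device {device}. CUDA driver allocated "
            f"memory was {driver_before} and is now {driver_after}."
        )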
2025-12-04T12:42:03.9313410Z Finished distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 ... [2025-12-04 12:42:03.732641][2291622.381822912], took 8.55min 2025-12-04T12:42:03.9313672Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:42:03.9313762Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:42:03.9313857Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:42:03.9313903Z Uploading artifacts took 0.00 seconds 2025-12-04T12:42:03.9313976Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 failed! 2025-12-04T12:42:03.9314101Z Running distributed/fsdp/test_fsdp_comm_hooks 1/1 ... [2025-12-04 12:42:03.735809][2291622.384993073] 2025-12-04T12:42:03.9314153Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:42:03.9314473Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm_hooks.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:42:03.735983] 2025-12-04T12:44:59.6648062Z 2025-12-04T12:44:59.6649101Z distributed/fsdp/test_fsdp_comm_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_comm_hooks_1.1_97288ec4bda6c925_.log 2025-12-04T12:44:59.6655216Z Running 28 items in this shard: test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy1, 
test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_hybrid_strategy, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy2 2025-12-04T12:44:59.6660825Z 2025-12-04T12:44:59.6660972Z Finished distributed/fsdp/test_fsdp_comm_hooks 1/1 ... [2025-12-04 12:44:59.664665][2291798.313846272], took 2.93min 2025-12-04T12:44:59.6669529Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:44:59.6688699Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:44:59.6692233Z Running distributed/fsdp/test_fsdp_hybrid_shard 1/1 ... [2025-12-04 12:44:59.669038][2291798.318221154] 2025-12-04T12:44:59.6692450Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:44:59.6693738Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_hybrid_shard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:44:59.669267] 2025-12-04T12:46:00.8655994Z 2025-12-04T12:46:00.8657439Z distributed/fsdp/test_fsdp_hybrid_shard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_hybrid_shard_1.1_dbce83217519cbf5_.log 2025-12-04T12:46:00.8661047Z Running 6 items in this shard: test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_fsdp_hybrid_shard_basic_setup, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_fsdp_hybrid_shard_parity, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_hsdp_save_load_state_dict, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_hsdp_sync_module_state, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_invalid_pg_specification_raises, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_raises_manual_wrap_hybrid_shard_when_none_policy 2025-12-04T12:46:00.8663681Z 2025-12-04T12:46:00.8664103Z Finished distributed/fsdp/test_fsdp_hybrid_shard 1/1 ... [2025-12-04 12:46:00.865386][2291859.514565652], took 1.02min 2025-12-04T12:46:00.8672985Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:00.8690148Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:00.8693036Z Running distributed/_shard/test_sharder 1/1 ... [2025-12-04 12:46:00.869214][2291859.518397333] 2025-12-04T12:46:00.8693235Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:00.8695160Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/test_sharder.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:46:00.869412] 2025-12-04T12:46:12.7521034Z 2025-12-04T12:46:12.7522972Z distributed/_shard/test_sharder 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.test_sharder_1.1_1eaaa65535e8636d_.log 2025-12-04T12:46:12.7524290Z Running 2 items in this shard: test/distributed/_shard/test_sharder.py::TestCustomSharder::test_custom_sharder, test/distributed/_shard/test_sharder.py::TestCustomSharder::test_custom_sharder_errors 2025-12-04T12:46:12.7524990Z 2025-12-04T12:46:12.7525287Z Finished distributed/_shard/test_sharder 1/1 ... [2025-12-04 12:46:12.751805][2291871.400984945], took 0.20min 2025-12-04T12:46:12.7539996Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:12.7556734Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:12.7559993Z Running distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 ... [2025-12-04 12:46:12.755870][2291871.405052992] 2025-12-04T12:46:12.7560399Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:12.7561871Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_tensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:46:12.756058] 2025-12-04T12:46:37.6071370Z 2025-12-04T12:46:37.6072898Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_tensor_ops_1.1_1dd185e705cadcbb_.log 2025-12-04T12:46:37.6075838Z Running 5 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_clone, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_deep_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_detach, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_inplace_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_set_requires_grad 2025-12-04T12:46:37.6077858Z 2025-12-04T12:46:37.6078539Z Finished distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 ... [2025-12-04 12:46:37.606944][2291896.25612344], took 0.41min 2025-12-04T12:46:37.6089552Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:37.6106581Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:37.6110077Z Running distributed/_shard/sharding_plan/test_sharding_plan 1/1 ... [2025-12-04 12:46:37.610884][2291896.26006846] 2025-12-04T12:46:37.6110313Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:37.6112049Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharding_plan/test_sharding_plan.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:46:37.611083] 2025-12-04T12:46:53.8988834Z 2025-12-04T12:46:53.8990362Z distributed/_shard/sharding_plan/test_sharding_plan 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharding_plan.test_sharding_plan_1.1_5a261ebd1ee62caf_.log 2025-12-04T12:46:53.8992605Z Running 3 items in this shard: test/distributed/_shard/sharding_plan/test_sharding_plan.py::TestShardingPlan::test_custom_sharding_planner, test/distributed/_shard/sharding_plan/test_sharding_plan.py::TestShardingPlan::test_shard_module_sub_process_group, test/distributed/_shard/sharding_plan/test_sharding_plan.py::TestShardingPlan::test_sharding_plan_errors 2025-12-04T12:46:53.8993880Z 2025-12-04T12:46:53.8994629Z Finished distributed/_shard/sharding_plan/test_sharding_plan 1/1 ... [2025-12-04 12:46:53.898595][2291912.547774341], took 0.27min 2025-12-04T12:46:53.9008954Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:53.9026064Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:53.9029337Z Running distributed/fsdp/test_fsdp_comm 1/1 ... [2025-12-04 12:46:53.902767][2291912.551951177] 2025-12-04T12:46:53.9029706Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:53.9030947Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:46:53.902959] 2025-12-04T12:52:45.7327593Z 2025-12-04T12:52:45.7328947Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm 1/1 (test/test-reports/distributed.fsdp.test_fsdp_comm_1.1_3b36b42e6bf366b5_.log) 2025-12-04T12:52:45.7329882Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bb216e1619e0039d.xml 2025-12-04T12:52:45.7331226Z ============================= test session starts ============================== 2025-12-04T12:52:45.7331824Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7332223Z cachedir: .pytest_cache 2025-12-04T12:52:45.7332696Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7333209Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7333458Z configfile: pytest.ini 2025-12-04T12:52:45.7333900Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7334337Z collecting ... collected 10 items 2025-12-04T12:52:45.7334599Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:52:45.7338266Z Running 10 items in this shard: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.7341920Z 2025-12-04T12:52:45.7342569Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda I1204 12:46:55.740000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 495924 2025-12-04T12:52:45.7343544Z I1204 12:46:55.741000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 495925 2025-12-04T12:52:45.7344163Z I1204 12:46:55.742000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 495926 2025-12-04T12:52:45.7344675Z I1204 12:46:55.742000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 495927 2025-12-04T12:52:45.7345438Z 
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7346094Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7346751Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7347334Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7348122Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7349050Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7350342Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7351120Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7351723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7352307Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7353070Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7353851Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7354326Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7354788Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7355390Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7356005Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7356271Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7356664Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7357274Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7357823Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7358411Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7376181Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7376948Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7377441Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7377944Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7378485Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7378953Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7379406Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7379866Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7380345Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7381068Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
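The UserWarning above ("FSDP got the argument `device_id` cuda on rank N, which does not have an explicit index...") prescribes its own fix: either call torch.cuda.set_device() before constructing FSDP, or pass a device_id with an explicit index. A minimal sketch of both options follows, assuming the default process group is already initialized for this rank; it is not the test's own wrapping code.

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(model, rank):
    # Option 1 from the warning: make the current device explicit first.
    torch.cuda.set_device(rank)
    return FSDP(model, device_id=torch.cuda.current_device())
    # Option 2 would be to pass the explicit index directly,
    # e.g. FSDP(model, device_id=rank), instead of the bare "cuda" device.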
2025-12-04T12:52:45.7381743Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7382100Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7382744Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7383303Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7383671Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7384091Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7384366Z dist init r=1, world=4 2025-12-04T12:52:45.7384580Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7384922Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7385415Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7385918Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7386401Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7386859Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7387315Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7387795Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7388313Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7388775Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7389242Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7389696Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7390159Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7390628Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7391334Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7392002Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7392357Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7392998Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7393552Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7393962Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7394382Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7394630Z dist init r=2, world=4 2025-12-04T12:52:45.7394840Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7395429Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7395918Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7396406Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7396891Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7397371Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7397807Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7398307Z [rank3]:E1204 12:47:03.294000 495927 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7398774Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7399236Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7399699Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7400153Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7400614Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7401093Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7401808Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7402482Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7402842Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7403507Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7404074Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7404450Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7404887Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7405142Z dist init r=3, world=4 2025-12-04T12:52:45.7405356Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7405705Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7406203Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7406731Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7407222Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7407679Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7408133Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7408669Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7409149Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7409625Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7410100Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7410564Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7411031Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7411509Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7412225Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7412925Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7413287Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7413936Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7414512Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7414884Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7415312Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7415564Z dist init r=0, world=4 2025-12-04T12:52:45.7416005Z [rank0]:[W1204 12:47:03.488733552 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7416436Z FAILED [9.3139s] [ 10%] 2025-12-04T12:52:45.7416512Z 2025-12-04T12:52:45.7416574Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7416821Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7417050Z Traceback (most recent call last): 2025-12-04T12:52:45.7417307Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7417562Z self._join_processes(fn) 2025-12-04T12:52:45.7417814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7418084Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7418400Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7418670Z raise RuntimeError(error) 2025-12-04T12:52:45.7418829Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7419003Z Traceback (most recent call last): 2025-12-04T12:52:45.7419252Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7419495Z getattr(self, test_name)() 2025-12-04T12:52:45.7419737Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7419981Z fn() 2025-12-04T12:52:45.7420192Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7420439Z method(*args, **kwargs) 2025-12-04T12:52:45.7420669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7420901Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7421123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7421361Z with policy(): 2025-12-04T12:52:45.7421586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7421829Z raise RuntimeError(msg) 2025-12-04T12:52:45.7422322Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7422754Z 2025-12-04T12:52:45.7422834Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7423230Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7423548Z 2025-12-04T12:52:45.7423658Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7423795Z 2025-12-04T12:52:45.7423859Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7423999Z Traceback (most recent call last): 2025-12-04T12:52:45.7424244Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7424489Z getattr(self, test_name)() 2025-12-04T12:52:45.7424730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7425012Z fn() 2025-12-04T12:52:45.7425225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7425455Z method(*args, **kwargs) 2025-12-04T12:52:45.7425672Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7425899Z method(*args, **kwargs) 2025-12-04T12:52:45.7426120Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7426347Z with policy(): 2025-12-04T12:52:45.7426564Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7426805Z raise RuntimeError(msg) 2025-12-04T12:52:45.7427273Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7427707Z 2025-12-04T12:52:45.7427792Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7428226Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7428535Z 2025-12-04T12:52:45.7428632Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7428759Z 2025-12-04T12:52:45.7428763Z 2025-12-04T12:52:45.7428853Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7429070Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7429445Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bb216e1619e0039d.xml - 2025-12-04T12:52:45.7429789Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7430187Z FAILED [9.3139s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7430559Z Traceback (most recent call last): 2025-12-04T12:52:45.7430817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7431094Z getattr(self, test_name)() 2025-12-04T12:52:45.7431339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7431583Z fn() 2025-12-04T12:52:45.7431797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7432038Z method(*args, **kwargs) 2025-12-04T12:52:45.7432269Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7432509Z method(*args, **kwargs) 2025-12-04T12:52:45.7432753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7432987Z with policy(): 2025-12-04T12:52:45.7433210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7433452Z raise RuntimeError(msg) 2025-12-04T12:52:45.7433920Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7434391Z 2025-12-04T12:52:45.7434468Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7434864Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7435177Z 2025-12-04T12:52:45.7435271Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7435404Z 2025-12-04T12:52:45.7435466Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7435618Z Traceback (most recent call last): 2025-12-04T12:52:45.7435870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7436122Z getattr(self, test_name)() 2025-12-04T12:52:45.7436364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7436607Z fn() 2025-12-04T12:52:45.7436817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7437070Z method(*args, **kwargs) 2025-12-04T12:52:45.7437297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7437526Z method(*args, **kwargs) 2025-12-04T12:52:45.7437748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7437976Z with policy(): 2025-12-04T12:52:45.7438219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7438456Z raise RuntimeError(msg) 2025-12-04T12:52:45.7438922Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7439349Z 2025-12-04T12:52:45.7439424Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7439808Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7440115Z 2025-12-04T12:52:45.7440225Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7440418Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7440584Z ============================== 1 failed in 9.32s =============================== 2025-12-04T12:52:45.7440721Z Got exit code 1 2025-12-04T12:52:45.7440823Z Retrying single test... 
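[editorial sketch] The ProcessGroupNCCL warning repeated above ("destroy_process_group() was not called before program exit, which can leak resources") describes the standard teardown it expects. A minimal illustration of that pattern, assuming a torchrun-style environment where RANK/MASTER_ADDR/MASTER_PORT are already set; this is not the actual harness code in common_distributed.py:

    # Illustrative only: explicit process-group teardown, as the NCCL warning asks for.
    import os
    import torch.distributed as dist

    def main() -> None:
        # Assumes env:// rendezvous variables are provided by the launcher (e.g. torchrun).
        dist.init_process_group(backend="nccl", init_method="env://")
        try:
            pass  # test or training body would run here
        finally:
            # Tearing the group down before exit avoids the resource-leak warning.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()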
2025-12-04T12:52:45.7441079Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a748046d038cbc77.xml 2025-12-04T12:52:45.7441366Z ============================= test session starts ============================== 2025-12-04T12:52:45.7441600Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7441792Z cachedir: .pytest_cache 2025-12-04T12:52:45.7442022Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7442268Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7442394Z configfile: pytest.ini 2025-12-04T12:52:45.7442623Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7442926Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7443301Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7443644Z Running 1 items in this shard 2025-12-04T12:52:45.7443718Z 2025-12-04T12:52:45.7444067Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda I1204 12:47:07.944000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 496326 2025-12-04T12:52:45.7444603Z I1204 12:47:07.945000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 496327 2025-12-04T12:52:45.7444948Z I1204 12:47:07.946000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 496328 2025-12-04T12:52:45.7445295Z I1204 12:47:07.946000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 496329 2025-12-04T12:52:45.7445850Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7446294Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7446874Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7447462Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7447915Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7448414Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7449004Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. 
FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7449598Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7450054Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7450493Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7451076Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7451656Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7452108Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7452577Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7453150Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7453734Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7453977Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7454321Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7454814Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7455303Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7455788Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7456240Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7456685Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7457154Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7457621Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7458089Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7458608Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7459062Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7459519Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7459991Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7460721Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
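[editorial sketch] The UserWarning from torch/distributed/fsdp/_init_utils.py above suggests two remedies: call torch.cuda.set_device() before FSDP initialization, or pass an explicitly indexed device as device_id. A minimal sketch of both options, assuming a rank integer and an existing nn.Module; illustrative only, not the wrapping code used by this test:

    # Illustrative only: silencing the FSDP `device_id` warning seen above.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model: torch.nn.Module, rank: int) -> FSDP:
        # Option 1: pin the current device first, then let FSDP pick it up.
        torch.cuda.set_device(rank)
        # Option 2: pass an indexed device instead of the bare "cuda" device.
        return FSDP(model, device_id=torch.device("cuda", rank))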
2025-12-04T12:52:45.7461388Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7461742Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7462415Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7462974Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7463342Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7463760Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7464005Z dist init r=1, world=4 2025-12-04T12:52:45.7464214Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7464557Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7465052Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7465537Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7466019Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7466477Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7466920Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7467387Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7467855Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7468389Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7468858Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7469322Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7469795Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7470263Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7470977Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7471675Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7472026Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7472667Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7473221Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7473591Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7474008Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7474251Z dist init r=0, world=4 2025-12-04T12:52:45.7474460Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7474804Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7475297Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7475782Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7476264Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7476717Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7477158Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7477639Z [rank2]:E1204 12:47:15.379000 496328 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7478212Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7478677Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7479160Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7479619Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7480076Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7480565Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7481286Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7481956Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7482308Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7482940Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7483494Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7483862Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7484277Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7484521Z dist init r=2, world=4 2025-12-04T12:52:45.7484729Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7485067Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7485561Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7486045Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7486526Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7486989Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7487433Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7487903Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7488425Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7488898Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7489366Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7489822Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7490309Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7490780Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7491488Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7492151Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7492506Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7493143Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7493691Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7494060Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7494480Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7494723Z dist init r=3, world=4 2025-12-04T12:52:45.7495129Z [rank0]:[W1204 12:47:15.149404416 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7495540Z FAILED [9.1147s] [100%] 2025-12-04T12:52:45.7495605Z 2025-12-04T12:52:45.7495667Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7495904Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7496128Z Traceback (most recent call last): 2025-12-04T12:52:45.7496394Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7496640Z self._join_processes(fn) 2025-12-04T12:52:45.7496891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7497158Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7497427Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7497688Z raise RuntimeError(error) 2025-12-04T12:52:45.7497866Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7498033Z Traceback (most recent call last): 2025-12-04T12:52:45.7498322Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7498564Z getattr(self, test_name)() 2025-12-04T12:52:45.7498800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7499050Z fn() 2025-12-04T12:52:45.7499256Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7499506Z method(*args, **kwargs) 2025-12-04T12:52:45.7499732Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7499967Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7500191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7500421Z with policy(): 2025-12-04T12:52:45.7500636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7500869Z raise RuntimeError(msg) 2025-12-04T12:52:45.7501332Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7501759Z 2025-12-04T12:52:45.7501837Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7502223Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7502531Z 2025-12-04T12:52:45.7502623Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7502749Z 2025-12-04T12:52:45.7502754Z 2025-12-04T12:52:45.7502834Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7503037Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7503395Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a748046d038cbc77.xml - 2025-12-04T12:52:45.7503727Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7504121Z FAILED [9.1147s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7504490Z Traceback (most recent call last): 2025-12-04T12:52:45.7504742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7504986Z getattr(self, test_name)() 2025-12-04T12:52:45.7505242Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7505479Z fn() 2025-12-04T12:52:45.7505685Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7505919Z method(*args, **kwargs) 2025-12-04T12:52:45.7506144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7506374Z method(*args, **kwargs) 2025-12-04T12:52:45.7506905Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7507136Z with policy(): 2025-12-04T12:52:45.7507349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7507582Z raise RuntimeError(msg) 2025-12-04T12:52:45.7508046Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7508546Z 2025-12-04T12:52:45.7508621Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7509007Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7509317Z 2025-12-04T12:52:45.7509408Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7509598Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7509768Z ======================= 1 failed, 9 deselected in 9.13s ======================== 2025-12-04T12:52:45.7509909Z Got exit code 1 2025-12-04T12:52:45.7510008Z Retrying single test... 2025-12-04T12:52:45.7510271Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-6fe0148fc13ba808.xml 2025-12-04T12:52:45.7510562Z ============================= test session starts ============================== 2025-12-04T12:52:45.7510776Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7510966Z cachedir: .pytest_cache 2025-12-04T12:52:45.7511193Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7511432Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7511557Z configfile: pytest.ini 2025-12-04T12:52:45.7511790Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7512063Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7512438Z stepcurrent: skipping 0 already run items. 
Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7512790Z Running 1 items in this shard 2025-12-04T12:52:45.7512863Z 2025-12-04T12:52:45.7513210Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda I1204 12:47:19.735000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 496728 2025-12-04T12:52:45.7513745Z I1204 12:47:19.736000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 496729 2025-12-04T12:52:45.7514090Z I1204 12:47:19.737000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 496730 2025-12-04T12:52:45.7514451Z I1204 12:47:19.738000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 496731 2025-12-04T12:52:45.7515008Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7515453Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7516051Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7516642Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7517096Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7517566Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7518138Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7518760Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7519218Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7519656Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7520230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7520816Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7521270Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7521711Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7522285Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7522876Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7523123Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7523476Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7523986Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7524475Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7524959Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7525415Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7525883Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7526359Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7526836Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7527341Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7527816Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7528309Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7528773Z [rank3]:E1204 12:47:26.986000 496731 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7529248Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7529970Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7530645Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7531005Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7531647Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7532217Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7532594Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7533018Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7533272Z dist init r=3, world=4 2025-12-04T12:52:45.7533508Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7533857Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7534357Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7534846Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7535345Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7535806Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7536256Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7536760Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.7537236Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7537706Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7538215Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7538678Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7539150Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7539623Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7540338Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7541015Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7541374Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7542015Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7542570Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7542953Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7543371Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7543619Z dist init r=0, world=4 2025-12-04T12:52:45.7543826Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7544170Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7544678Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7545166Z [rank2]:E1204 12:47:27.175000 496730 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7545659Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7546150Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7546600Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7547081Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7547557Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7548030Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7548545Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7549006Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7549472Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7549946Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7550654Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7551320Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7551681Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7552330Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7552884Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7553255Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7553677Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7553922Z dist init r=2, world=4 2025-12-04T12:52:45.7554143Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7554484Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7554971Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7555485Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7555973Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7556425Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7556871Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7557336Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7557804Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7558316Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7558789Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7559243Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7559700Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7560168Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7560877Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7561541Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7561912Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7562547Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7563097Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7563474Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7563889Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7564137Z dist init r=1, world=4 2025-12-04T12:52:45.7564540Z [rank0]:[W1204 12:47:27.919627588 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7564994Z FAILED [9.1145s] [100%] 2025-12-04T12:52:45.7565058Z 2025-12-04T12:52:45.7565121Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7565355Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7565580Z Traceback (most recent call last): 2025-12-04T12:52:45.7565839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7566096Z self._join_processes(fn) 2025-12-04T12:52:45.7566356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7566633Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7566911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7567174Z raise RuntimeError(error) 2025-12-04T12:52:45.7567330Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7567495Z Traceback (most recent call last): 2025-12-04T12:52:45.7567740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7567987Z getattr(self, test_name)() 2025-12-04T12:52:45.7568265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7568506Z fn() 2025-12-04T12:52:45.7568711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7568945Z method(*args, **kwargs) 2025-12-04T12:52:45.7569170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7569410Z method(*args, **kwargs) 2025-12-04T12:52:45.7569631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7569861Z with policy(): 2025-12-04T12:52:45.7570079Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7570313Z raise RuntimeError(msg) 2025-12-04T12:52:45.7570795Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 
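Editor's note on the ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources"): a distributed script normally tears down its default process group explicitly. The snippet below is only a minimal sketch of that teardown pattern, not the test's actual code; the backend, addresses, rank and world size are placeholders (the job itself runs NCCL/RCCL on 4 GPUs).

    # Minimal sketch of explicit process-group teardown; placeholders only,
    # not the code of test_fsdp_comm.py. "gloo" keeps the sketch runnable on CPU.
    import os
    import torch.distributed as dist

    def main() -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        rank = int(os.environ.get("RANK", "0"))
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
        try:
            pass  # per-rank test body would run here
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning seen in the log above.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()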
2025-12-04T12:52:45.7571223Z 2025-12-04T12:52:45.7571305Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7571698Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7572010Z 2025-12-04T12:52:45.7572106Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7572234Z 2025-12-04T12:52:45.7572300Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7572464Z Traceback (most recent call last): 2025-12-04T12:52:45.7572713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7572961Z getattr(self, test_name)() 2025-12-04T12:52:45.7573200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7573438Z fn() 2025-12-04T12:52:45.7573662Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7573913Z method(*args, **kwargs) 2025-12-04T12:52:45.7574133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7574364Z method(*args, **kwargs) 2025-12-04T12:52:45.7574586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7574812Z with policy(): 2025-12-04T12:52:45.7575024Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7575255Z raise RuntimeError(msg) 2025-12-04T12:52:45.7575713Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7576141Z 2025-12-04T12:52:45.7576215Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7576600Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7576907Z 2025-12-04T12:52:45.7576995Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7577122Z 2025-12-04T12:52:45.7577124Z 2025-12-04T12:52:45.7577201Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7577405Z Process 0 terminated with exit code 10, terminating remaining processes. 
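Editor's note on the traceback above: each rank exits with code 10 when the leak check raises, and _join_processes / _check_return_codes in common_distributed.py re-raise in the parent pytest process ("Process 0 exited with error code 10 and exception: ..."). The sketch below is a simplified illustration of that spawn-and-check pattern using only the standard multiprocessing module; it is not the harness implementation.

    # Simplified illustration of the spawn-and-check pattern; the real logic
    # lives in common_distributed.py (_join_processes / _check_return_codes).
    import multiprocessing as mp
    import sys

    def _worker(rank: int, world_size: int) -> None:
        ok = True                    # a detected leak in the real harness flips this
        sys.exit(0 if ok else 10)    # exit code 10 is what the log reports

    def run_parallel_test(world_size: int = 4) -> None:
        procs = [mp.Process(target=_worker, args=(r, world_size)) for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        failed = [(p.pid, p.exitcode) for p in procs if p.exitcode != 0]
        if failed:
            # Mirrors "RuntimeError: Process N exited with error code 10 ...".
            raise RuntimeError(f"worker processes failed: {failed}")

    if __name__ == "__main__":
        run_parallel_test()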
2025-12-04T12:52:45.7577762Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-6fe0148fc13ba808.xml - 2025-12-04T12:52:45.7578094Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7578516Z FAILED [9.1145s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7578879Z Traceback (most recent call last): 2025-12-04T12:52:45.7579126Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7579369Z getattr(self, test_name)() 2025-12-04T12:52:45.7579624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7579862Z fn() 2025-12-04T12:52:45.7580065Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7580296Z method(*args, **kwargs) 2025-12-04T12:52:45.7580517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7580749Z method(*args, **kwargs) 2025-12-04T12:52:45.7580972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7581201Z with policy(): 2025-12-04T12:52:45.7581432Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7581663Z raise RuntimeError(msg) 2025-12-04T12:52:45.7582126Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 
2025-12-04T12:52:45.7582581Z 2025-12-04T12:52:45.7582664Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7583052Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7583359Z 2025-12-04T12:52:45.7583455Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7583579Z 2025-12-04T12:52:45.7583646Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7583795Z Traceback (most recent call last): 2025-12-04T12:52:45.7584045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7584294Z getattr(self, test_name)() 2025-12-04T12:52:45.7584533Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7584772Z fn() 2025-12-04T12:52:45.7584979Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7585214Z method(*args, **kwargs) 2025-12-04T12:52:45.7585440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7585676Z method(*args, **kwargs) 2025-12-04T12:52:45.7585899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7586132Z with policy(): 2025-12-04T12:52:45.7586349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7586587Z raise RuntimeError(msg) 2025-12-04T12:52:45.7587055Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7587481Z 2025-12-04T12:52:45.7587564Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7587952Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7588301Z 2025-12-04T12:52:45.7588391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7588606Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.7588779Z ======================= 1 failed, 9 deselected in 9.12s ======================== 2025-12-04T12:52:45.7588926Z Got exit code 1 2025-12-04T12:52:45.7589213Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7589603Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7589964Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-32fc4cd2f4792970.xml 2025-12-04T12:52:45.7590271Z ============================= test session starts ============================== 2025-12-04T12:52:45.7590490Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7590685Z cachedir: .pytest_cache 2025-12-04T12:52:45.7590917Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7591161Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7591304Z configfile: pytest.ini 2025-12-04T12:52:45.7591555Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7591831Z collecting ... collected 10 items / 1 deselected / 9 selected 2025-12-04T12:52:45.7591999Z stepcurrent: skipping 1 already run items. 2025-12-04T12:52:45.7592135Z Running 9 items in this shard 2025-12-04T12:52:45.7592212Z 2025-12-04T12:52:45.7592565Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda I1204 12:47:31.524000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 497130 2025-12-04T12:52:45.7593104Z I1204 12:47:31.525000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 497131 2025-12-04T12:52:45.7593452Z I1204 12:47:31.526000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 497132 2025-12-04T12:52:45.7593798Z I1204 12:47:31.526000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 497133 2025-12-04T12:52:45.7594357Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7594810Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7595395Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
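Editor's note on the repeated UserWarning from torch/distributed/fsdp/_init_utils.py above: the warning names its own two remedies, namely setting the rank's CUDA device as current before constructing FSDP, or passing a device_id with an explicit index instead of the bare "cuda" string. A minimal hedged sketch of both options follows; `model` and `rank` are placeholders, not objects from this test.

    # Sketch of the two remedies the FSDP warning suggests; `model` and `rank`
    # are placeholders, not values taken from the failing test.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(model: torch.nn.Module, rank: int) -> FSDP:
        # Option 1: make the per-rank device current before FSDP initialization.
        torch.cuda.set_device(rank)
        # Option 2: pass an indexed device instead of the bare "cuda" string
        # that triggered the warning above.
        return FSDP(model, device_id=torch.device("cuda", rank))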
2025-12-04T12:52:45.7595988Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7596449Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7596891Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7597482Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7598076Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7598571Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7599013Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7599465Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7599902Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7600477Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7601100Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7601690Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7602276Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7602524Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7602877Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7603374Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7603869Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7604359Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7604815Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7605263Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7605737Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7606210Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7606680Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7607163Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7607620Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7608088Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7608597Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7609327Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7609997Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7610354Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7611023Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7611580Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7611950Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7612372Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7612620Z dist init r=1, world=4 2025-12-04T12:52:45.7612832Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7613179Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7613672Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7614156Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7614644Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7615099Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7615545Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7616016Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7616485Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7617000Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7617473Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7617934Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7618434Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7618905Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7619618Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7620312Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7620672Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7621316Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7621871Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7622243Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7622667Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7622917Z dist init r=0, world=4 2025-12-04T12:52:45.7623130Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7623476Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7623968Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7624459Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7624942Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7625393Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7625840Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7626322Z [rank2]:E1204 12:47:38.998000 497132 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7626791Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7627261Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7627750Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7628247Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7628708Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7629193Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7629917Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7630583Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7630938Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7631578Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7632134Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7632505Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7632923Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7633171Z dist init r=2, world=4 2025-12-04T12:52:45.7633381Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7633724Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7634223Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7634708Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7649946Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7650497Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7650941Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7651413Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7651895Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7652354Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7652815Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7653263Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7653744Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7654206Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7654921Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
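Editor's note on the RuntimeError failing each rank: it is raised by the leak-check context manager in common_utils.py, which compares caching-allocator and driver-level memory counters before and after the test body (the "was 512 and is now reported as 19456" / "driver allocated memory was ... and is now ..." figures above). The snippet below is only a rough sketch of that kind of before/after comparison, with none of the real checker's thresholds, retries, or per-device bookkeeping.

    # Rough sketch of a before/after memory comparison in the spirit of the
    # leak checker's report; this is NOT the implementation in common_utils.py.
    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)     # caching-allocator bytes
        free_before, _total = torch.cuda.mem_get_info(device)  # driver-level view
        fn()
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _total = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, "
                f"driver free memory {free_before} -> {free_after} bytes"
            )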
2025-12-04T12:52:45.7655585Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7655933Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7656566Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7657117Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7657481Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7657892Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7658138Z dist init r=3, world=4 2025-12-04T12:52:45.7658589Z [rank0]:[W1204 12:47:39.764541734 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7658997Z FAILED [9.1139s] [ 11%] 2025-12-04T12:52:45.7659062Z 2025-12-04T12:52:45.7659122Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7659353Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7659569Z Traceback (most recent call last): 2025-12-04T12:52:45.7659831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7660073Z self._join_processes(fn) 2025-12-04T12:52:45.7660320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7660583Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7660849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7661108Z raise RuntimeError(error) 2025-12-04T12:52:45.7661290Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7661449Z Traceback (most recent call last): 2025-12-04T12:52:45.7661686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7661927Z getattr(self, test_name)() 2025-12-04T12:52:45.7662155Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7662406Z fn() 2025-12-04T12:52:45.7662604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7662850Z method(*args, **kwargs) 2025-12-04T12:52:45.7663070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7663296Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7663514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7663738Z with policy(): 2025-12-04T12:52:45.7663947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7664176Z raise RuntimeError(msg) 2025-12-04T12:52:45.7664639Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7665065Z 2025-12-04T12:52:45.7665140Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7665526Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7665835Z 2025-12-04T12:52:45.7665924Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7666050Z 2025-12-04T12:52:45.7666052Z 2025-12-04T12:52:45.7666133Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7666333Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7666691Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-32fc4cd2f4792970.xml - 2025-12-04T12:52:45.7667021Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7667405Z FAILED [9.1139s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7667769Z Traceback (most recent call last): 2025-12-04T12:52:45.7668012Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7668298Z getattr(self, test_name)() 2025-12-04T12:52:45.7668551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7668782Z fn() 2025-12-04T12:52:45.7668982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7669212Z method(*args, **kwargs) 2025-12-04T12:52:45.7669429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7669653Z method(*args, **kwargs) 2025-12-04T12:52:45.7669883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7670107Z with policy(): 2025-12-04T12:52:45.7670314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7670539Z raise RuntimeError(msg) 2025-12-04T12:52:45.7670998Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7671448Z 2025-12-04T12:52:45.7671526Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7671909Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7672214Z 2025-12-04T12:52:45.7672305Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7672491Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7672654Z ======================= 1 failed, 1 deselected in 9.12s ======================== 2025-12-04T12:52:45.7672794Z Got exit code 1 2025-12-04T12:52:45.7672890Z Retrying single test... 2025-12-04T12:52:45.7673142Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-04466835009a9b1b.xml 2025-12-04T12:52:45.7673428Z ============================= test session starts ============================== 2025-12-04T12:52:45.7673638Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7673822Z cachedir: .pytest_cache 2025-12-04T12:52:45.7674046Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7674282Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7674399Z configfile: pytest.ini 2025-12-04T12:52:45.7674625Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7674895Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7675268Z stepcurrent: skipping 1 already run items. 
Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7675614Z Running 1 items in this shard 2025-12-04T12:52:45.7675685Z 2025-12-04T12:52:45.7676030Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda I1204 12:47:43.388000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 497532 2025-12-04T12:52:45.7676562Z I1204 12:47:43.389000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 497533 2025-12-04T12:52:45.7676901Z I1204 12:47:43.390000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 497534 2025-12-04T12:52:45.7677251Z I1204 12:47:43.391000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 497535 2025-12-04T12:52:45.7677798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7678291Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7678887Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7679470Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7679925Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7680397Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7680964Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7681544Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7681993Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7682424Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7682986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7683563Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7684009Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7684437Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7685004Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7685582Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7685819Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7686160Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7686663Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7687138Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7687615Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7688062Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7688552Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7689014Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7689476Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7689965Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7690427Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7690874Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7691324Z [rank1]:E1204 12:47:50.648000 497533 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7691785Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7692493Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7693153Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7693500Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7694134Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7694686Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7695049Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7695461Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7695699Z dist init r=1, world=4 2025-12-04T12:52:45.7695926Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7696260Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7696744Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7697219Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7697707Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7698203Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7698639Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7699129Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.7699589Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7700047Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7700510Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7700957Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7701409Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7701866Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7702566Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7703229Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7703577Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7704207Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7704751Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7705128Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7705540Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7705779Z dist init r=2, world=4 2025-12-04T12:52:45.7705977Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7706309Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7706805Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7707280Z [rank0]:E1204 12:47:50.814000 497532 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7707751Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7708254Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7708691Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7709152Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7709614Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7710072Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7710531Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7710978Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7711430Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7711890Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7712587Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7713247Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7713596Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7714244Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7714788Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7715152Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7715563Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7715802Z dist init r=0, world=4 2025-12-04T12:52:45.7716019Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7716355Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7716838Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7717345Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7717822Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7718300Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7718737Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7719195Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7719654Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7720116Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7720575Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7721022Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7721476Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7721938Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7722645Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7723302Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7723659Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7724288Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7724837Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7725208Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7725619Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7725859Z dist init r=3, world=4 2025-12-04T12:52:45.7726260Z [rank0]:[W1204 12:47:51.743317954 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7726698Z FAILED [9.1153s] [100%] 2025-12-04T12:52:45.7726761Z 2025-12-04T12:52:45.7726822Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7727053Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7727273Z Traceback (most recent call last): 2025-12-04T12:52:45.7727515Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7727758Z self._join_processes(fn) 2025-12-04T12:52:45.7728002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7728291Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7728559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7728820Z raise RuntimeError(error) 2025-12-04T12:52:45.7728974Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7729134Z Traceback (most recent call last): 2025-12-04T12:52:45.7729373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7729614Z getattr(self, test_name)() 2025-12-04T12:52:45.7729843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7730072Z fn() 2025-12-04T12:52:45.7730274Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7730502Z method(*args, **kwargs) 2025-12-04T12:52:45.7730723Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7730952Z method(*args, **kwargs) 2025-12-04T12:52:45.7731167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7731390Z with policy(): 2025-12-04T12:52:45.7731602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7731832Z raise RuntimeError(msg) 2025-12-04T12:52:45.7732307Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7732732Z 2025-12-04T12:52:45.7732806Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7733189Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7733499Z 2025-12-04T12:52:45.7733590Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7733715Z 2025-12-04T12:52:45.7733717Z 2025-12-04T12:52:45.7733808Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7734011Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7734367Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-04466835009a9b1b.xml - 2025-12-04T12:52:45.7734693Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7735097Z FAILED [9.1153s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7735474Z Traceback (most recent call last): 2025-12-04T12:52:45.7735717Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7735955Z getattr(self, test_name)() 2025-12-04T12:52:45.7736189Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7736417Z fn() 2025-12-04T12:52:45.7736616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7736842Z method(*args, **kwargs) 2025-12-04T12:52:45.7737057Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7737284Z method(*args, **kwargs) 2025-12-04T12:52:45.7737503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7737724Z with policy(): 2025-12-04T12:52:45.7737935Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7738205Z raise RuntimeError(msg) 2025-12-04T12:52:45.7738667Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7739090Z 2025-12-04T12:52:45.7739164Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7739550Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7739855Z 2025-12-04T12:52:45.7739944Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7740129Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
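The mem_leak_check failure above is reported per rank: the caching allocator grew from 512 to 19456 bytes and driver-level allocation grew by roughly 0.8 GB on every device, so the check flags a leak and each worker exits with code 10. As a rough local approximation of that before/after comparison (a hedged sketch only, not the actual check in common_utils.py; the helper name assert_no_cuda_leak is made up for illustration):

import gc
from contextlib import contextmanager
import torch

@contextmanager
def assert_no_cuda_leak(device: int):
    # Settle outstanding GPU work and drop cached blocks before sampling.
    torch.cuda.synchronize(device)
    gc.collect()
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free                          # driver-level bytes in use
    yield
    torch.cuda.synchronize(device)
    gc.collect()
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA leak on device {device}: allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )

The repro command printed in the log (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda) re-runs only this test with the same check enabled.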
2025-12-04T12:52:45.7740291Z ======================= 1 failed, 9 deselected in 9.13s ======================== 2025-12-04T12:52:45.7740427Z Got exit code 1 2025-12-04T12:52:45.7740521Z Retrying single test... 2025-12-04T12:52:45.7740773Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-1a7d0527355a3ac0.xml 2025-12-04T12:52:45.7741067Z ============================= test session starts ============================== 2025-12-04T12:52:45.7741277Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7741466Z cachedir: .pytest_cache 2025-12-04T12:52:45.7741687Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7741925Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7742041Z configfile: pytest.ini 2025-12-04T12:52:45.7742280Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7742550Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7742924Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7743261Z Running 1 items in this shard 2025-12-04T12:52:45.7743332Z 2025-12-04T12:52:45.7743677Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda I1204 12:47:55.370000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 497934 2025-12-04T12:52:45.7744253Z I1204 12:47:55.371000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 497935 2025-12-04T12:52:45.7744595Z I1204 12:47:55.372000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 497936 2025-12-04T12:52:45.7744938Z I1204 12:47:55.372000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 497937 2025-12-04T12:52:45.7745490Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7745930Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7746507Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7747093Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7747542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7747976Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7748587Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7749173Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7749617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7750046Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7750485Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7750915Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7751491Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7752070Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7752652Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7753243Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7753495Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7753832Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7754321Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7754799Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7755274Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7755719Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7756156Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7756617Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7757082Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7757539Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7758004Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7758533Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7758984Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7759459Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7760168Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7760831Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7761193Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7761825Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7762375Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7762771Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7763183Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7763421Z dist init r=3, world=4 2025-12-04T12:52:45.7763624Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7763956Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7764440Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7764916Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7765392Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7765839Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7766277Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7766739Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7767204Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7767663Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7768123Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7768606Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7769075Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7769538Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7770251Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 2025-12-04T12:52:45.7770912Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7771260Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7771902Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7772462Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7772822Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7773234Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7773474Z dist init r=0, world=4 2025-12-04T12:52:45.7773678Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7774013Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7774493Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7774967Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7775440Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7775881Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7776158Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7776306Z [rank1]:E1204 12:48:02.812000 497935 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7776584Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7776731Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7777024Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7777162Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7777441Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7777598Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7778113Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7778277Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7778486Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7778885Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7778997Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7779216Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7779387Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7779428Z dist init r=1, world=4 2025-12-04T12:52:45.7779568Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7779728Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7780016Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7780171Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7780457Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7780583Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7780864Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7781012Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7781300Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7781450Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7781726Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7781876Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7782154Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7782308Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7782827Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7782965Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7783163Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7783562Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7783678Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7783892Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7784058Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7784098Z dist init r=2, world=4 2025-12-04T12:52:45.7784437Z [rank0]:[W1204 12:48:02.634762525 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7784479Z FAILED [9.2148s] [100%] 2025-12-04T12:52:45.7784482Z 2025-12-04T12:52:45.7784541Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7784682Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7784730Z Traceback (most recent call last): 2025-12-04T12:52:45.7784897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7784940Z self._join_processes(fn) 2025-12-04T12:52:45.7785115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7785170Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7785361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7785405Z raise RuntimeError(error) 2025-12-04T12:52:45.7785488Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7785534Z Traceback (most recent call last): 2025-12-04T12:52:45.7785697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7785740Z getattr(self, test_name)() 2025-12-04T12:52:45.7785897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7785935Z fn() 2025-12-04T12:52:45.7786102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7786146Z method(*args, **kwargs) 2025-12-04T12:52:45.7786298Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7786340Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7786490Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7786551Z with policy(): 2025-12-04T12:52:45.7786704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7786746Z raise RuntimeError(msg) 2025-12-04T12:52:45.7787141Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 2025-12-04T12:52:45.7787144Z 2025-12-04T12:52:45.7787221Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7787496Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7787500Z 2025-12-04T12:52:45.7787588Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7787591Z 2025-12-04T12:52:45.7787653Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7787697Z Traceback (most recent call last): 2025-12-04T12:52:45.7787861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7787904Z getattr(self, test_name)() 2025-12-04T12:52:45.7788066Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7788101Z fn() 2025-12-04T12:52:45.7788294Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7788334Z method(*args, **kwargs) 2025-12-04T12:52:45.7788486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7788527Z method(*args, **kwargs) 2025-12-04T12:52:45.7788679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7788716Z with policy(): 2025-12-04T12:52:45.7788870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7788912Z raise RuntimeError(msg) 2025-12-04T12:52:45.7789318Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7789320Z 2025-12-04T12:52:45.7789395Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7789670Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7789673Z 2025-12-04T12:52:45.7789764Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7789766Z 2025-12-04T12:52:45.7789768Z 2025-12-04T12:52:45.7789845Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7789948Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7790180Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-1a7d0527355a3ac0.xml - 2025-12-04T12:52:45.7790243Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7790527Z FAILED [9.2148s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7790605Z Traceback (most recent call last): 2025-12-04T12:52:45.7790769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7790815Z getattr(self, test_name)() 2025-12-04T12:52:45.7790981Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7791015Z fn() 2025-12-04T12:52:45.7791170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7791210Z method(*args, **kwargs) 2025-12-04T12:52:45.7791363Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7791404Z method(*args, **kwargs) 2025-12-04T12:52:45.7791556Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7791594Z with policy(): 2025-12-04T12:52:45.7791749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7791790Z raise RuntimeError(msg) 2025-12-04T12:52:45.7792183Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 
2025-12-04T12:52:45.7792186Z 2025-12-04T12:52:45.7792260Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7792537Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7792541Z 2025-12-04T12:52:45.7792628Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7792633Z 2025-12-04T12:52:45.7792694Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7792742Z Traceback (most recent call last): 2025-12-04T12:52:45.7792904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7792949Z getattr(self, test_name)() 2025-12-04T12:52:45.7793107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7793155Z fn() 2025-12-04T12:52:45.7793306Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7793349Z method(*args, **kwargs) 2025-12-04T12:52:45.7793500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7793541Z method(*args, **kwargs) 2025-12-04T12:52:45.7793689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7793729Z with policy(): 2025-12-04T12:52:45.7793889Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7793933Z raise RuntimeError(msg) 2025-12-04T12:52:45.7794320Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7794332Z 2025-12-04T12:52:45.7794408Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7794690Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7794692Z 2025-12-04T12:52:45.7794779Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7794848Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
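Two recoverable issues repeat on every retry of this test: the FSDP UserWarning that `device_id` was passed as a bare "cuda" device with no index, and the ProcessGroupNCCL warning that destroy_process_group() was not called before exit. A minimal per-rank setup/teardown sketch that avoids both warnings (illustrative only; this is not how common_distributed.py actually spawns its test workers, and it assumes MASTER_ADDR/MASTER_PORT are set in the environment):

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def run_rank(rank: int, world_size: int, model: torch.nn.Module) -> None:
    # Uses the default env:// init, so MASTER_ADDR / MASTER_PORT must be exported.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)                           # make the current device explicit before FSDP init
    fsdp_model = FSDP(model.cuda(rank), device_id=rank)   # explicit device index silences the device_id warning
    try:
        # training / test body would run fsdp_model here
        pass
    finally:
        dist.destroy_process_group()                      # releases NCCL resources and avoids the shutdown warning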
2025-12-04T12:52:45.7794911Z ======================= 1 failed, 9 deselected in 9.22s ======================== 2025-12-04T12:52:45.7794951Z Got exit code 1 2025-12-04T12:52:45.7795173Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7795305Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7795496Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-3614e4f517affde8.xml 2025-12-04T12:52:45.7795557Z ============================= test session starts ============================== 2025-12-04T12:52:45.7795670Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7795712Z cachedir: .pytest_cache 2025-12-04T12:52:45.7795869Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7795918Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7795958Z configfile: pytest.ini 2025-12-04T12:52:45.7796123Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7796196Z collecting ... collected 10 items / 2 deselected / 8 selected 2025-12-04T12:52:45.7796252Z stepcurrent: skipping 2 already run items. 2025-12-04T12:52:45.7796297Z Running 8 items in this shard 2025-12-04T12:52:45.7796299Z 2025-12-04T12:52:45.7796645Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda I1204 12:48:07.244000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 498336 2025-12-04T12:52:45.7796801Z I1204 12:48:07.245000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 498337 2025-12-04T12:52:45.7796953Z I1204 12:48:07.246000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 498338 2025-12-04T12:52:45.7797115Z I1204 12:48:07.246000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 498339 2025-12-04T12:52:45.7797475Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7797529Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7798031Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7798097Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7798496Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7798571Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7799061Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7799122Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7799477Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7799524Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7800012Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7800075Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7800427Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7800474Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7800962Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7801023Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7801166Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7801329Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7801640Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7801794Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7802083Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7802209Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7802500Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7802651Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7802930Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7803099Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7803374Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7803512Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7803790Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7803939Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7804457Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 
2025-12-04T12:52:45.7804574Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7804771Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7805170Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7805292Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7805507Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7805674Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7805712Z dist init r=3, world=4 2025-12-04T12:52:45.7805862Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7806022Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7806314Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7806468Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7806762Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7806891Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7807167Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7807342Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7807619Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7807770Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7808050Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7808228Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7808509Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7808658Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7809177Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7809294Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7809489Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7809887Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7810001Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7810229Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7810393Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7810434Z dist init r=2, world=4 2025-12-04T12:52:45.7810573Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7810735Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7811035Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7811189Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7811475Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7811621Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7811900Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7812048Z [rank1]:E1204 12:48:14.817000 498337 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7812325Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7812471Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7812746Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7812886Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7813163Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7813311Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7813825Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7813943Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7814141Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7814545Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7814659Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7814870Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7815037Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7815075Z dist init r=1, world=4 2025-12-04T12:52:45.7815225Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7815383Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7815670Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7815837Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7816136Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7816261Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7816536Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7816685Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7816960Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7817110Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7817387Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7817524Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7817804Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7817952Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7818500Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
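Note: the RuntimeError above is raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 path, which compares per-device memory counters before and after the test body. A rough, illustrative sketch of that comparison; the helper name and the pass/fail condition are assumptions, not the harness's actual implementation:

    import torch

    def check_leak(device: int, run_test) -> None:
        # Snapshot caching-allocator and driver-level usage before the test.
        alloc_before = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free

        run_test()
        torch.cuda.synchronize(device)

        alloc_after = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_after = total - free

        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )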
2025-12-04T12:52:45.7818613Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7818820Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7819219Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7819332Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7819557Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7819721Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7819763Z dist init r=0, world=4 2025-12-04T12:52:45.7820100Z [rank0]:[W1204 12:48:15.893943547 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7820165Z FAILED [9.2145s] [ 12%] 2025-12-04T12:52:45.7820168Z 2025-12-04T12:52:45.7820227Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7820362Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7820411Z Traceback (most recent call last): 2025-12-04T12:52:45.7820574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7820620Z self._join_processes(fn) 2025-12-04T12:52:45.7820792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7820850Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7821030Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7821076Z raise RuntimeError(error) 2025-12-04T12:52:45.7821158Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7821207Z Traceback (most recent call last): 2025-12-04T12:52:45.7821368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7821414Z getattr(self, test_name)() 2025-12-04T12:52:45.7821573Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7821612Z fn() 2025-12-04T12:52:45.7821763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7821806Z method(*args, **kwargs) 2025-12-04T12:52:45.7821958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7822000Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7822150Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7822190Z with policy(): 2025-12-04T12:52:45.7822341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7822386Z raise RuntimeError(msg) 2025-12-04T12:52:45.7822784Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7822787Z 2025-12-04T12:52:45.7822861Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7823133Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7823137Z 2025-12-04T12:52:45.7823225Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7823227Z 2025-12-04T12:52:45.7823229Z 2025-12-04T12:52:45.7823317Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7823405Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7823643Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-3614e4f517affde8.xml - 2025-12-04T12:52:45.7823704Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7823996Z FAILED [9.2145s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7824055Z Traceback (most recent call last): 2025-12-04T12:52:45.7824219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7824265Z getattr(self, test_name)() 2025-12-04T12:52:45.7824426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7824465Z fn() 2025-12-04T12:52:45.7824617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7824660Z method(*args, **kwargs) 2025-12-04T12:52:45.7824811Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7824854Z method(*args, **kwargs) 2025-12-04T12:52:45.7825006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7825045Z with policy(): 2025-12-04T12:52:45.7825196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7825240Z raise RuntimeError(msg) 2025-12-04T12:52:45.7825629Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7825631Z 2025-12-04T12:52:45.7825708Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7825981Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7825984Z 2025-12-04T12:52:45.7826072Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7826137Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7826200Z ======================= 1 failed, 2 deselected in 9.22s ======================== 2025-12-04T12:52:45.7826240Z Got exit code 1 2025-12-04T12:52:45.7826280Z Retrying single test... 2025-12-04T12:52:45.7826470Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-49798aeb29a97079.xml 2025-12-04T12:52:45.7826540Z ============================= test session starts ============================== 2025-12-04T12:52:45.7826656Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7826699Z cachedir: .pytest_cache 2025-12-04T12:52:45.7826860Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7826906Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7826949Z configfile: pytest.ini 2025-12-04T12:52:45.7827112Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7827198Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7827461Z stepcurrent: skipping 2 already run items. 
Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7827509Z Running 1 items in this shard 2025-12-04T12:52:45.7827511Z 2025-12-04T12:52:45.7827856Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda I1204 12:48:19.254000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 498738 2025-12-04T12:52:45.7828034Z I1204 12:48:19.255000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 498739 2025-12-04T12:52:45.7828223Z I1204 12:48:19.256000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 498740 2025-12-04T12:52:45.7828375Z I1204 12:48:19.256000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 498741 2025-12-04T12:52:45.7828739Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7828788Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7829284Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7829349Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7829706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7829754Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7830242Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7830306Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7830657Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7830706Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7831207Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7831269Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7831634Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7831679Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7832169Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7832249Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7832406Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7832568Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7832858Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7833015Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7833301Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7833429Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7833705Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7833857Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7834135Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7834288Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7834571Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7834712Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7834993Z [rank2]:E1204 12:48:26.617000 498740 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7835153Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7835672Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7835794Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7836002Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7836409Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7836524Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7836752Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7836930Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7836975Z dist init r=2, world=4 2025-12-04T12:52:45.7837115Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7837279Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7837572Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7837728Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7838019Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7838173Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7838458Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7838607Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.7838889Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7839041Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7839316Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7839455Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7839748Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7839901Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7840428Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7840546Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7840746Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7841148Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7841291Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7841503Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7841670Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7841708Z dist init r=3, world=4 2025-12-04T12:52:45.7841848Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7842009Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7842299Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7842455Z [rank0]:E1204 12:48:26.867000 498738 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7842738Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7842865Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7843144Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7843298Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7843575Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7843726Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7844016Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7844154Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7844436Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7844597Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7845112Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7845239Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7845444Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7845846Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7845960Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7846175Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7846340Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7846383Z dist init r=0, world=4 2025-12-04T12:52:45.7846521Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7846683Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7846972Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7847126Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7847412Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7847538Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7847818Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7847967Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7848379Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7848527Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7848805Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7848943Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7849238Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7849391Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7849903Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7850045Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7850245Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7850647Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7850765Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7850977Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7851144Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7851185Z dist init r=1, world=4 2025-12-04T12:52:45.7851527Z [rank0]:[W1204 12:48:27.803446166 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7851568Z FAILED [9.3150s] [100%] 2025-12-04T12:52:45.7851572Z 2025-12-04T12:52:45.7851629Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7851768Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7851815Z Traceback (most recent call last): 2025-12-04T12:52:45.7851980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7852025Z self._join_processes(fn) 2025-12-04T12:52:45.7852200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7852255Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7852446Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7852491Z raise RuntimeError(error) 2025-12-04T12:52:45.7852574Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7852621Z Traceback (most recent call last): 2025-12-04T12:52:45.7852787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7852830Z getattr(self, test_name)() 2025-12-04T12:52:45.7852991Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7853024Z fn() 2025-12-04T12:52:45.7853187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7853229Z method(*args, **kwargs) 2025-12-04T12:52:45.7853383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7853424Z method(*args, **kwargs) 2025-12-04T12:52:45.7853575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7853622Z with policy(): 2025-12-04T12:52:45.7853787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7853827Z raise RuntimeError(msg) 2025-12-04T12:52:45.7854217Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
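Note: the ProcessGroupNCCL warning in the output above ("destroy_process_group() was not called before program exit") refers to the documented shutdown API. A minimal sketch of that teardown, assuming the group is created with torch.distributed.init_process_group and that MASTER_ADDR/MASTER_PORT are set in the environment; the try/finally placement is illustrative:

    import torch.distributed as dist

    def run_rank(rank: int, world_size: int) -> None:
        # Assumes MASTER_ADDR and MASTER_PORT are already set for rendezvous.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            pass  # distributed work for this rank goes here
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()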
2025-12-04T12:52:45.7854220Z 2025-12-04T12:52:45.7854296Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7854566Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7854569Z 2025-12-04T12:52:45.7854659Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7854663Z 2025-12-04T12:52:45.7854664Z 2025-12-04T12:52:45.7854740Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7854828Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7855064Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-49798aeb29a97079.xml - 2025-12-04T12:52:45.7855127Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7855412Z FAILED [9.3150s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7855458Z Traceback (most recent call last): 2025-12-04T12:52:45.7855622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7855665Z getattr(self, test_name)() 2025-12-04T12:52:45.7855827Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7855862Z fn() 2025-12-04T12:52:45.7856017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7856056Z method(*args, **kwargs) 2025-12-04T12:52:45.7856208Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7856248Z method(*args, **kwargs) 2025-12-04T12:52:45.7856411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7856449Z with policy(): 2025-12-04T12:52:45.7856605Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7856647Z raise RuntimeError(msg) 2025-12-04T12:52:45.7857046Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7857048Z 2025-12-04T12:52:45.7857122Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7857397Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7857399Z 2025-12-04T12:52:45.7857490Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7857565Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.7857644Z ======================= 1 failed, 9 deselected in 9.33s ======================== 2025-12-04T12:52:45.7857683Z Got exit code 1 2025-12-04T12:52:45.7857728Z Retrying single test... 2025-12-04T12:52:45.7857917Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-5e212175c44e621e.xml 2025-12-04T12:52:45.7857978Z ============================= test session starts ============================== 2025-12-04T12:52:45.7858090Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7858134Z cachedir: .pytest_cache 2025-12-04T12:52:45.7858330Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7858379Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7858422Z configfile: pytest.ini 2025-12-04T12:52:45.7858589Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7858664Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7858930Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7858974Z Running 1 items in this shard 2025-12-04T12:52:45.7858979Z 2025-12-04T12:52:45.7859321Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda I1204 12:48:31.253000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 499140 2025-12-04T12:52:45.7859478Z I1204 12:48:31.254000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 499141 2025-12-04T12:52:45.7859633Z I1204 12:48:31.255000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 499142 2025-12-04T12:52:45.7859786Z I1204 12:48:31.255000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 499143 2025-12-04T12:52:45.7860148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7860200Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7860707Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7860774Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7861143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7861191Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7861681Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7861754Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7862124Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7862171Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7862659Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7862721Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7863071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7863124Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7863610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7863673Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7863817Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7863983Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7864277Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7864434Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7864727Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7864873Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7865154Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7865304Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7865594Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7865742Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7866018Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7866166Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7866453Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7866606Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7867123Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
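[editor's note] The RuntimeError above is raised by the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: allocator and driver memory are snapshotted before the test body runs and compared afterwards. A rough sketch of that before/after idea using only public torch.cuda APIs; the real check in common_utils.py is more involved, so this is illustrative only and check_for_leak is a made-up helper:

    # Illustrative before/after CUDA memory comparison, similar in spirit to
    # PYTORCH_TEST_CUDA_MEM_LEAK_CHECK (not the actual implementation).
    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        allocated_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)
        driver_before = total - free_before

        fn()  # run the test body

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        allocated_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if allocated_after > allocated_before:
            raise RuntimeError(
                f"caching allocator grew: {allocated_before} -> {allocated_after} "
                f"(driver: {driver_before} -> {driver_after}) on device {device}"
            )

The numbers quoted in the log (512 -> 19456 on the caching allocator, ~2.3 GB -> ~3.1 GB on the driver) are exactly this kind of before/after pair, reported per rank.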
2025-12-04T12:52:45.7867241Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7867440Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7867842Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7867960Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7868221Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7868388Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7868429Z dist init r=2, world=4 2025-12-04T12:52:45.7868569Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7868729Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7869019Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7869188Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7869475Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7869604Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7869892Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7870042Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7870318Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7870468Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7870776Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7870914Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7871192Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7871340Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7871855Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 2025-12-04T12:52:45.7871976Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7872172Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7872568Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7872682Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7872895Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7873059Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7873105Z dist init r=0, world=4 2025-12-04T12:52:45.7873244Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7873416Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7873703Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7873859Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7874144Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7874276Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7874553Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7874699Z [rank1]:E1204 12:48:38.901000 499141 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7874993Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7875140Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7875414Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7875554Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7875831Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7875984Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7876497Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7876615Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7876813Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7877210Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7877328Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7877539Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7877715Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7877752Z dist init r=1, world=4 2025-12-04T12:52:45.7877896Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7878059Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7878384Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7878552Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7878837Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7878960Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7879249Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7879417Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7879692Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7879841Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7880118Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7880258Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7880537Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7880687Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7881201Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
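[editor's note] The "Started process N with pid ..." and "dist init r=N, world=4" lines come from the multi-process test harness in common_distributed.py, which spawns one worker per rank and runs the test method inside each. A stripped-down sketch of that per-rank spawn pattern; _worker and the port choice are hypothetical, not the harness's own code:

    # Stripped-down per-rank spawn pattern, as a sketch of what the
    # common_distributed.py harness does at a high level.
    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp

    def _worker(rank: int, world_size: int) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            # ... run the test body on this rank ...
            dist.barrier()
        finally:
            # Explicit teardown also avoids the ProcessGroupNCCL warning about
            # destroy_process_group() not being called before exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        world_size = 4
        mp.spawn(_worker, args=(world_size,), nprocs=world_size, join=True)

The parent process joins the four workers and turns any non-zero per-rank exit code (here, 10) into the "Process N exited with error code 10" failure shown further down.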
2025-12-04T12:52:45.7881322Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7881517Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7881914Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7882040Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7882251Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7882415Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7882456Z dist init r=3, world=4 2025-12-04T12:52:45.7882803Z [rank0]:[W1204 12:48:39.732262120 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7882845Z FAILED [9.3159s] [100%] 2025-12-04T12:52:45.7882847Z 2025-12-04T12:52:45.7882903Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7883036Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7883082Z Traceback (most recent call last): 2025-12-04T12:52:45.7883244Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7883308Z self._join_processes(fn) 2025-12-04T12:52:45.7883481Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7883538Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7883718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7883765Z raise RuntimeError(error) 2025-12-04T12:52:45.7883847Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7883896Z Traceback (most recent call last): 2025-12-04T12:52:45.7884060Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7884105Z getattr(self, test_name)() 2025-12-04T12:52:45.7884265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7884302Z fn() 2025-12-04T12:52:45.7884453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7884498Z method(*args, **kwargs) 2025-12-04T12:52:45.7884650Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7884694Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7884847Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7884888Z with policy(): 2025-12-04T12:52:45.7885042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7885086Z raise RuntimeError(msg) 2025-12-04T12:52:45.7885478Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7885482Z 2025-12-04T12:52:45.7885557Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7885832Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7885834Z 2025-12-04T12:52:45.7885923Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7885937Z 2025-12-04T12:52:45.7885939Z 2025-12-04T12:52:45.7886019Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7886108Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7886343Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-5e212175c44e621e.xml - 2025-12-04T12:52:45.7886403Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7886698Z FAILED [9.3159s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7886749Z Traceback (most recent call last): 2025-12-04T12:52:45.7886917Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7886963Z getattr(self, test_name)() 2025-12-04T12:52:45.7887123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7887179Z fn() 2025-12-04T12:52:45.7887331Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7887375Z method(*args, **kwargs) 2025-12-04T12:52:45.7887526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7887571Z method(*args, **kwargs) 2025-12-04T12:52:45.7887723Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7887764Z with policy(): 2025-12-04T12:52:45.7887918Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7887961Z raise RuntimeError(msg) 2025-12-04T12:52:45.7888399Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7888403Z 2025-12-04T12:52:45.7888482Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7888757Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7888759Z 2025-12-04T12:52:45.7888847Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7888915Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7888978Z ======================= 1 failed, 9 deselected in 9.33s ======================== 2025-12-04T12:52:45.7889018Z Got exit code 1 2025-12-04T12:52:45.7889238Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7889371Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7889560Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-ee7066dd84237162.xml 2025-12-04T12:52:45.7889622Z ============================= test session starts ============================== 2025-12-04T12:52:45.7889733Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7889778Z cachedir: .pytest_cache 2025-12-04T12:52:45.7889955Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7890004Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7890045Z configfile: pytest.ini 2025-12-04T12:52:45.7890210Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7890284Z collecting ... collected 10 items / 3 deselected / 7 selected 2025-12-04T12:52:45.7890339Z stepcurrent: skipping 3 already run items. 
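[editor's note] The sequence above (exit code 1, "Retrying single test...", then "FAILED CONSISTENTLY" and "continuing with the rest of the tests due to continue-through-error being set") reflects the runner's policy of re-running a failed test in isolation and only then recording it as a consistent failure while the rest of the shard proceeds. A hedged sketch of that control flow; the real logic lives in the test runner, and run_single_test and run_shard here are made-up placeholders:

    # Hypothetical sketch of the retry-then-continue policy seen in the log.
    import subprocess
    import sys
    from typing import Sequence

    def run_single_test(test_id: str) -> int:
        # Placeholder: invoke pytest on exactly one test and return its exit code.
        return subprocess.call([sys.executable, "-m", "pytest", test_id, "-x"])

    def run_shard(tests: Sequence[str], continue_through_error: bool = True) -> list[str]:
        consistent_failures = []
        for test_id in tests:
            if run_single_test(test_id) == 0:
                continue
            # First failure: retry the single test in isolation.
            if run_single_test(test_id) != 0:
                consistent_failures.append(test_id)
                if not continue_through_error:
                    break
        return consistent_failures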
2025-12-04T12:52:45.7890384Z Running 7 items in this shard 2025-12-04T12:52:45.7890388Z 2025-12-04T12:52:45.7890745Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda I1204 12:48:43.271000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 499542 2025-12-04T12:52:45.7890905Z I1204 12:48:43.272000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 499543 2025-12-04T12:52:45.7891057Z I1204 12:48:43.273000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 499544 2025-12-04T12:52:45.7891225Z I1204 12:48:43.273000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 499545 2025-12-04T12:52:45.7891597Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7891649Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7892143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7896293Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7896660Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7896709Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7897206Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7897267Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7897622Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7897670Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7898195Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7898256Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7898638Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7898687Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7899190Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7899249Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7899393Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7899558Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7899851Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7900044Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7900335Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7900459Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7900741Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7900890Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7901171Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7901320Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7901595Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7901733Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7902010Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T12:52:45.7902160Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7902678Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7902804Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7903000Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7903401Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7903526Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7903740Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7903906Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7903945Z dist init r=3, world=4 2025-12-04T12:52:45.7904093Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7904264Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7904551Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7904704Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7904987Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7905112Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7905389Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7905536Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7905812Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7905961Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7906237Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7906373Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7906651Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7906801Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7907323Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7907439Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7907635Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7908041Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7908194Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7908407Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7908596Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7908635Z dist init r=1, world=4 2025-12-04T12:52:45.7908771Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7908933Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7909219Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7909373Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7909659Z [rank0]:E1204 12:48:50.679000 499542 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7909783Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7910059Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7910207Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7910481Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7910628Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7910904Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7911039Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7911324Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7911473Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7912000Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 
2025-12-04T12:52:45.7912115Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7912312Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7912707Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7912840Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7913051Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7913216Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7913253Z dist init r=0, world=4 2025-12-04T12:52:45.7913391Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7913550Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7913836Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7913992Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7914274Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7914399Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7914673Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7914823Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7915099Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7915247Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7915532Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7915668Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7915945Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7916093Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7916619Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7916733Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7916940Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7917350Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7917463Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7917673Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7917836Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7917875Z dist init r=2, world=4 2025-12-04T12:52:45.7918247Z [rank0]:[W1204 12:48:50.527016978 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7918288Z FAILED [9.1147s] [ 14%] 2025-12-04T12:52:45.7918290Z 2025-12-04T12:52:45.7918349Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7918483Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7918530Z Traceback (most recent call last): 2025-12-04T12:52:45.7918694Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7918738Z self._join_processes(fn) 2025-12-04T12:52:45.7918911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7918967Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7919144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7919188Z raise RuntimeError(error) 2025-12-04T12:52:45.7919271Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7919316Z Traceback (most recent call last): 2025-12-04T12:52:45.7919478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7919521Z getattr(self, test_name)() 2025-12-04T12:52:45.7919698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7919734Z fn() 2025-12-04T12:52:45.7919885Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7919928Z method(*args, **kwargs) 2025-12-04T12:52:45.7920077Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7920118Z method(*args, **kwargs) 2025-12-04T12:52:45.7920281Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7920321Z with policy(): 2025-12-04T12:52:45.7920472Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7920514Z raise RuntimeError(msg) 2025-12-04T12:52:45.7920903Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7920935Z 2025-12-04T12:52:45.7921009Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7921280Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7921284Z 2025-12-04T12:52:45.7921372Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7921374Z 2025-12-04T12:52:45.7921376Z 2025-12-04T12:52:45.7921453Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7921542Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7921778Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-ee7066dd84237162.xml - 2025-12-04T12:52:45.7921841Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7922128Z FAILED [9.1147s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7922175Z Traceback (most recent call last): 2025-12-04T12:52:45.7922340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7922383Z getattr(self, test_name)() 2025-12-04T12:52:45.7922542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7922579Z fn() 2025-12-04T12:52:45.7922730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7922775Z method(*args, **kwargs) 2025-12-04T12:52:45.7922925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7922965Z method(*args, **kwargs) 2025-12-04T12:52:45.7923115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7923153Z with policy(): 2025-12-04T12:52:45.7923304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7923346Z raise RuntimeError(msg) 2025-12-04T12:52:45.7924688Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7924693Z 2025-12-04T12:52:45.7924768Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7925037Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7925040Z 2025-12-04T12:52:45.7925137Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7925202Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
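[editor's note] The repro line printed above can be run as-is from the repository root; the sketch below simply wraps that same command in Python so the two environment flags are set explicitly. The command and test name are copied verbatim from the log; nothing else is assumed:

    # Reproduce the strategy1 failure locally with the flags the log prints.
    import os
    import subprocess

    env = dict(os.environ)
    env["PYTORCH_TEST_WITH_ROCM"] = "1"
    env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"

    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_comm.py",
            "TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda",
        ],
        env=env,
        check=False,  # inspect the return code and output instead of raising
    )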
2025-12-04T12:52:45.7925264Z ======================= 1 failed, 3 deselected in 9.12s ======================== 2025-12-04T12:52:45.7925302Z Got exit code 1 2025-12-04T12:52:45.7925343Z Retrying single test... 2025-12-04T12:52:45.7925531Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-036f49a76ee38524.xml 2025-12-04T12:52:45.7925602Z ============================= test session starts ============================== 2025-12-04T12:52:45.7925725Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7925766Z cachedir: .pytest_cache 2025-12-04T12:52:45.7925923Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7925970Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7926010Z configfile: pytest.ini 2025-12-04T12:52:45.7926173Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7926248Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7926509Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7926555Z Running 1 items in this shard 2025-12-04T12:52:45.7926557Z 2025-12-04T12:52:45.7926900Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda I1204 12:48:55.120000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 499944 2025-12-04T12:52:45.7927057Z I1204 12:48:55.122000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 499945 2025-12-04T12:52:45.7927208Z I1204 12:48:55.122000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 499946 2025-12-04T12:52:45.7927358Z I1204 12:48:55.123000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 499947 2025-12-04T12:52:45.7927717Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7927767Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7928121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7928197Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7928706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7928772Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7929272Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7929333Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7929687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7929733Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7930230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7930306Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7930658Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7930704Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7931192Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7931252Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7931396Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7931558Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7931850Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7932004Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7932290Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7932414Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7932690Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7932853Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7933129Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7933278Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7933555Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7933702Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7933980Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7934127Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7934660Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7934776Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7934972Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7935370Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7935485Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7935696Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7935861Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7935900Z dist init r=1, world=4 2025-12-04T12:52:45.7936038Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7936197Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7936483Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7936636Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7936920Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7937044Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7937328Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7937478Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7937753Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7937912Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7938223Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7938359Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7938647Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7938811Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7939327Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7939443Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7939639Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7940037Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7940150Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7940361Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7940524Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7940563Z dist init r=3, world=4 2025-12-04T12:52:45.7940702Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7940860Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7941147Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7941300Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7941597Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7941724Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7941999Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7942158Z [rank2]:E1204 12:49:02.678000 499946 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7942433Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7942580Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7942863Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7943009Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7943286Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7943433Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7943948Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7944063Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7944261Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7944659Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7944772Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7944983Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7945145Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7945184Z dist init r=2, world=4 2025-12-04T12:52:45.7945322Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7945481Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7945776Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7945932Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7946219Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7946353Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7946632Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7946778Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7947054Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7947225Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7947501Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7947636Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7947912Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7948062Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7948618Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7948733Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7948928Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7949324Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7949439Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7949650Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7949813Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7949851Z dist init r=0, world=4 2025-12-04T12:52:45.7950205Z [rank0]:[W1204 12:49:03.807760409 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7950246Z FAILED [9.2132s] [100%] 2025-12-04T12:52:45.7950248Z 2025-12-04T12:52:45.7950305Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7950440Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7950486Z Traceback (most recent call last): 2025-12-04T12:52:45.7950663Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7950706Z self._join_processes(fn) 2025-12-04T12:52:45.7950881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7950935Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7951114Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7951183Z raise RuntimeError(error) 2025-12-04T12:52:45.7951265Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7951311Z Traceback (most recent call last): 2025-12-04T12:52:45.7951472Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7951514Z getattr(self, test_name)() 2025-12-04T12:52:45.7951673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7951707Z fn() 2025-12-04T12:52:45.7951860Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7951900Z method(*args, **kwargs) 2025-12-04T12:52:45.7952051Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7952094Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7952246Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7952282Z with policy(): 2025-12-04T12:52:45.7952435Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7952477Z raise RuntimeError(msg) 2025-12-04T12:52:45.7952869Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7952871Z 2025-12-04T12:52:45.7952947Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7953220Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7953223Z 2025-12-04T12:52:45.7953312Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7953314Z 2025-12-04T12:52:45.7953373Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7953420Z Traceback (most recent call last): 2025-12-04T12:52:45.7953582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7953625Z getattr(self, test_name)() 2025-12-04T12:52:45.7953795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7953830Z fn() 2025-12-04T12:52:45.7953980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7954022Z method(*args, **kwargs) 2025-12-04T12:52:45.7954171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7954211Z method(*args, **kwargs) 2025-12-04T12:52:45.7954362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7954409Z with policy(): 2025-12-04T12:52:45.7954559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7954601Z raise RuntimeError(msg) 2025-12-04T12:52:45.7954989Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 
2025-12-04T12:52:45.7955010Z 2025-12-04T12:52:45.7955084Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7955353Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7955355Z 2025-12-04T12:52:45.7955443Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7955445Z 2025-12-04T12:52:45.7955447Z 2025-12-04T12:52:45.7955523Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7955612Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7955847Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-036f49a76ee38524.xml - 2025-12-04T12:52:45.7955909Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7956192Z FAILED [9.2132s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7956240Z Traceback (most recent call last): 2025-12-04T12:52:45.7956405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7956450Z getattr(self, test_name)() 2025-12-04T12:52:45.7956608Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7956645Z fn() 2025-12-04T12:52:45.7956797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7956840Z method(*args, **kwargs) 2025-12-04T12:52:45.7956991Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7957034Z method(*args, **kwargs) 2025-12-04T12:52:45.7957183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7957221Z with policy(): 2025-12-04T12:52:45.7957372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7957414Z raise RuntimeError(msg) 2025-12-04T12:52:45.7957813Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7957817Z 2025-12-04T12:52:45.7957890Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7958211Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7958213Z 2025-12-04T12:52:45.7958302Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7958304Z 2025-12-04T12:52:45.7958380Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7958426Z Traceback (most recent call last): 2025-12-04T12:52:45.7958587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7958630Z getattr(self, test_name)() 2025-12-04T12:52:45.7958790Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7958837Z fn() 2025-12-04T12:52:45.7959002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7959043Z method(*args, **kwargs) 2025-12-04T12:52:45.7959196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7959236Z method(*args, **kwargs) 2025-12-04T12:52:45.7959387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7959423Z with policy(): 2025-12-04T12:52:45.7959578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7959619Z raise RuntimeError(msg) 2025-12-04T12:52:45.7960006Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7960010Z 2025-12-04T12:52:45.7960084Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7960355Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7960357Z 2025-12-04T12:52:45.7960445Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7960509Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7960576Z ======================= 1 failed, 9 deselected in 9.22s ======================== 2025-12-04T12:52:45.7960614Z Got exit code 1 2025-12-04T12:52:45.7960655Z Retrying single test... 
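The per-rank UserWarnings above ("FSDP got the argument `device_id` cuda ..., which does not have an explicit index") also state their own remedy: either set the current device before constructing FSDP or pass an indexed device. A minimal sketch of both options; `model` is a placeholder and the snippet assumes the process group has already been initialized with one GPU per rank (e.g. under torchrun):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    rank = dist.get_rank()
    model = torch.nn.Linear(8, 8)  # placeholder for the real module

    # Option 1: pin the current device first, so an index-less device_id resolves
    # to the intended GPU instead of triggering the warning.
    torch.cuda.set_device(rank)
    fsdp_model = FSDP(model, device_id=torch.device("cuda"))

    # Option 2: pass the explicit index so FSDP does not have to guess.
    fsdp_model = FSDP(model, device_id=torch.device("cuda", rank))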
2025-12-04T12:52:45.7960845Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e156b6b4e48e5ac7.xml 2025-12-04T12:52:45.7960905Z ============================= test session starts ============================== 2025-12-04T12:52:45.7961016Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7961059Z cachedir: .pytest_cache 2025-12-04T12:52:45.7961216Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7961263Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7961303Z configfile: pytest.ini 2025-12-04T12:52:45.7961488Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7961564Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7961828Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7961876Z Running 1 items in this shard 2025-12-04T12:52:45.7961878Z 2025-12-04T12:52:45.7962228Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda I1204 12:49:06.982000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 500346 2025-12-04T12:52:45.7962383Z I1204 12:49:06.983000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 500347 2025-12-04T12:52:45.7962536Z I1204 12:49:06.983000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 500348 2025-12-04T12:52:45.7962688Z I1204 12:49:06.984000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 500349 2025-12-04T12:52:45.7963063Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7963122Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7963617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7963680Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7964036Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7964085Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7964437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7964483Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7964973Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7965034Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7965520Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7965581Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7965943Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7965990Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7966477Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7966536Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7966688Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7966852Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7967145Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7967315Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7967612Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7967738Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7968015Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7968212Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7968489Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7968639Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7968914Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7969054Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7969334Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7969487Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7970005Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 
2025-12-04T12:52:45.7970121Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7970333Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7970732Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7970849Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7971075Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7971240Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7971281Z dist init r=3, world=4 2025-12-04T12:52:45.7971420Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7971592Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7971892Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7972046Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7972329Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7972455Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7972731Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7972880Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7973155Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7973300Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7973575Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7973711Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7973991Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7974141Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7974666Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7974783Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7974978Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7975388Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7975502Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7975715Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7975890Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7975939Z dist init r=2, world=4 2025-12-04T12:52:45.7976079Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7976238Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7976529Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7976682Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7976967Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7977092Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7977369Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7977517Z [rank1]:E1204 12:49:14.481000 500347 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7977793Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7977939Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7978249Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7978385Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7978662Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7978827Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7979339Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7979455Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7979670Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7980066Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7980192Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7980416Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7980581Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7980622Z dist init r=1, world=4 2025-12-04T12:52:45.7980760Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7980922Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7981208Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7981367Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7981651Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7981775Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7982049Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7982199Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7982474Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7982620Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7982895Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7983045Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7983323Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7983475Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7983996Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7984113Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7984307Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7984724Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7984839Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7985050Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7985216Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7985253Z dist init r=0, world=4 2025-12-04T12:52:45.7985589Z [rank0]:[W1204 12:49:14.562640898 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7985629Z FAILED [9.5161s] [100%] 2025-12-04T12:52:45.7985631Z 2025-12-04T12:52:45.7985688Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7985824Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7985870Z Traceback (most recent call last): 2025-12-04T12:52:45.7986032Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7986077Z self._join_processes(fn) 2025-12-04T12:52:45.7986250Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7986308Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7986486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7986531Z raise RuntimeError(error) 2025-12-04T12:52:45.7986611Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7986658Z Traceback (most recent call last): 2025-12-04T12:52:45.7986818Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7986864Z getattr(self, test_name)() 2025-12-04T12:52:45.7987022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7987067Z fn() 2025-12-04T12:52:45.7987219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7987262Z method(*args, **kwargs) 2025-12-04T12:52:45.7987417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7987457Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7987606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7987643Z with policy(): 2025-12-04T12:52:45.7987808Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7987850Z raise RuntimeError(msg) 2025-12-04T12:52:45.7988273Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7988290Z 2025-12-04T12:52:45.7988364Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7988648Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7988650Z 2025-12-04T12:52:45.7988738Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7988742Z 2025-12-04T12:52:45.7988800Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7988846Z Traceback (most recent call last): 2025-12-04T12:52:45.7989009Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7989052Z getattr(self, test_name)() 2025-12-04T12:52:45.7989209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7989246Z fn() 2025-12-04T12:52:45.7989397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7989439Z method(*args, **kwargs) 2025-12-04T12:52:45.7989588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7989627Z method(*args, **kwargs) 2025-12-04T12:52:45.7989777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7989817Z with policy(): 2025-12-04T12:52:45.7989968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7990008Z raise RuntimeError(msg) 2025-12-04T12:52:45.7990396Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7990400Z 2025-12-04T12:52:45.7990474Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7990743Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7990746Z 2025-12-04T12:52:45.7990833Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7990835Z 2025-12-04T12:52:45.7990837Z 2025-12-04T12:52:45.7990933Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7991021Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7991256Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e156b6b4e48e5ac7.xml - 2025-12-04T12:52:45.7991316Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7991599Z FAILED [9.5161s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7991657Z Traceback (most recent call last): 2025-12-04T12:52:45.7991821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7991863Z getattr(self, test_name)() 2025-12-04T12:52:45.7992025Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7992058Z fn() 2025-12-04T12:52:45.7992221Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7992270Z method(*args, **kwargs) 2025-12-04T12:52:45.7992421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7992460Z method(*args, **kwargs) 2025-12-04T12:52:45.7992612Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7992649Z with policy(): 2025-12-04T12:52:45.7992801Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7992842Z raise RuntimeError(msg) 2025-12-04T12:52:45.7993228Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7993232Z 2025-12-04T12:52:45.7993305Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7993575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7993577Z 2025-12-04T12:52:45.7993666Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7993668Z 2025-12-04T12:52:45.7993725Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7993770Z Traceback (most recent call last): 2025-12-04T12:52:45.7993933Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7993975Z getattr(self, test_name)() 2025-12-04T12:52:45.7994135Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7994169Z fn() 2025-12-04T12:52:45.7994320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7994359Z method(*args, **kwargs) 2025-12-04T12:52:45.7994511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7994551Z method(*args, **kwargs) 2025-12-04T12:52:45.7994701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7994736Z with policy(): 2025-12-04T12:52:45.7994909Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7994952Z raise RuntimeError(msg) 2025-12-04T12:52:45.7995336Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7995339Z 2025-12-04T12:52:45.7995411Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7995701Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7995703Z 2025-12-04T12:52:45.7995792Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7995857Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
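A minimal sketch (not part of this log) of the kind of before/after accounting that the mem-leak check above reports, assuming a single visible CUDA/ROCm device; the names snapshot and the workload placeholder are illustrative, and the real harness is the one enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 inside torch.testing._internal.common_utils, not this code.

    import torch

    def snapshot(device: int = 0):
        # Compare the caching-allocator view with the driver-level view, as the
        # "Caching allocator allocated memory was ... CUDA driver allocated memory
        # was ..." message above does.
        torch.cuda.synchronize(device)
        alloc = torch.cuda.memory_allocated(device)      # bytes held by the caching allocator
        free, total = torch.cuda.mem_get_info(device)    # driver-level (free, total)
        return alloc, total - free                       # (allocator bytes, driver-allocated bytes)

    before = snapshot(0)
    # ... run the suspect workload here (placeholder) ...
    after = snapshot(0)
    if after[0] > before[0] or after[1] > before[1]:
        print(f"possible leak: allocator {before[0]} -> {after[0]}, "
              f"driver {before[1]} -> {after[1]}")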
2025-12-04T12:52:45.7995921Z ======================= 1 failed, 9 deselected in 9.53s ======================== 2025-12-04T12:52:45.7995979Z Got exit code 1 2025-12-04T12:52:45.7996200Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7996328Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7996520Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b0fce4bab8e79ff7.xml 2025-12-04T12:52:45.7996577Z ============================= test session starts ============================== 2025-12-04T12:52:45.7996691Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7996735Z cachedir: .pytest_cache 2025-12-04T12:52:45.7996892Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7996938Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7996982Z configfile: pytest.ini 2025-12-04T12:52:45.7997143Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7997218Z collecting ... collected 10 items / 4 deselected / 6 selected 2025-12-04T12:52:45.7997270Z stepcurrent: skipping 4 already run items. 2025-12-04T12:52:45.7997316Z Running 6 items in this shard 2025-12-04T12:52:45.7997318Z 2025-12-04T12:52:45.7997662Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda I1204 12:49:19.064000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 500748 2025-12-04T12:52:45.7997818Z I1204 12:49:19.066000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 500749 2025-12-04T12:52:45.7997971Z I1204 12:49:19.066000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 500750 2025-12-04T12:52:45.7998121Z I1204 12:49:19.067000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 500751 2025-12-04T12:52:45.7998692Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7998755Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7999255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7999317Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7999811Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7999870Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8000358Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8000446Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8000590Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8000756Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8001048Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8001203Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8001492Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8001619Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8001900Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8002049Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8002326Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8002476Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8002753Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8002891Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8003185Z [rank1]:E1204 12:49:26.057000 500749 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8003335Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8003850Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8003976Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8004172Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8004571Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8004710Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8004921Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8005089Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8005128Z dist init r=1, world=4 2025-12-04T12:52:45.8005267Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8005426Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8005715Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8005871Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8006156Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8006279Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8006555Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8006706Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8006981Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8007131Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8007427Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8007563Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8007841Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8007990Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8008552Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8008667Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8008861Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8009288Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8009402Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8009616Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8009782Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8009822Z dist init r=3, world=4 2025-12-04T12:52:45.8009960Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8010120Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8010406Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8010560Z [rank2]:E1204 12:49:26.201000 500750 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8010844Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8010969Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8011244Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8011391Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8011667Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8011826Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8012102Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8012240Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8012525Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8012677Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8013188Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 
2025-12-04T12:52:45.8013323Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8013520Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8013918Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8014034Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8014246Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8014412Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8014451Z dist init r=2, world=4 2025-12-04T12:52:45.8014589Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8014749Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8015035Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8015191Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8015475Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8015599Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8015878Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8016040Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8016313Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8016463Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8016750Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8016886Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8017163Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8017310Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8017844Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8017959Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8018194Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8018593Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8018708Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8018921Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8019084Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8019123Z dist init r=0, world=4 2025-12-04T12:52:45.8019459Z [rank0]:[W1204 12:49:26.147153513 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8019502Z FAILED [8.9164s] [ 16%] 2025-12-04T12:52:45.8019505Z 2025-12-04T12:52:45.8019562Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8019696Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8019743Z Traceback (most recent call last): 2025-12-04T12:52:45.8019906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8019950Z self._join_processes(fn) 2025-12-04T12:52:45.8020123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8020194Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8020372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8020417Z raise RuntimeError(error) 2025-12-04T12:52:45.8020498Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8020544Z Traceback (most recent call last): 2025-12-04T12:52:45.8020704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8020747Z getattr(self, test_name)() 2025-12-04T12:52:45.8020922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8020960Z fn() 2025-12-04T12:52:45.8021111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8021153Z method(*args, **kwargs) 2025-12-04T12:52:45.8021302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8021356Z method(*args, **kwargs) 2025-12-04T12:52:45.8021517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8021554Z with policy(): 2025-12-04T12:52:45.8021704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8021746Z raise RuntimeError(msg) 2025-12-04T12:52:45.8022133Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8022138Z 2025-12-04T12:52:45.8022214Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8022485Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8022489Z 2025-12-04T12:52:45.8022578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8022580Z 2025-12-04T12:52:45.8022582Z 2025-12-04T12:52:45.8022657Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8022745Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8022982Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b0fce4bab8e79ff7.xml - 2025-12-04T12:52:45.8023043Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8023326Z FAILED [8.9164s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8023373Z Traceback (most recent call last): 2025-12-04T12:52:45.8023536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8023580Z getattr(self, test_name)() 2025-12-04T12:52:45.8023739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8023775Z fn() 2025-12-04T12:52:45.8023925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8023966Z method(*args, **kwargs) 2025-12-04T12:52:45.8024132Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8024175Z method(*args, **kwargs) 2025-12-04T12:52:45.8024325Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8024365Z with policy(): 2025-12-04T12:52:45.8024515Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8024557Z raise RuntimeError(msg) 2025-12-04T12:52:45.8024953Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8024956Z 2025-12-04T12:52:45.8025032Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8025305Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8025334Z 2025-12-04T12:52:45.8025422Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8025486Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
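A minimal sketch, under assumptions, of the remedy suggested by the FSDP UserWarning repeated above ("please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument"); wrap_with_fsdp and rank are illustrative names, and the sketch assumes the process group is already initialized on each rank.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(model, rank):
        # Assumes torch.distributed.init_process_group(...) has already run on this rank.
        torch.cuda.set_device(rank)                                 # make the current device explicit
        return FSDP(model, device_id=torch.device("cuda", rank))    # indexed device_id avoids the warning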
2025-12-04T12:52:45.8025549Z ======================= 1 failed, 4 deselected in 8.93s ======================== 2025-12-04T12:52:45.8025586Z Got exit code 1 2025-12-04T12:52:45.8025628Z Retrying single test... 2025-12-04T12:52:45.8025820Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-57e00d741dcc51b9.xml 2025-12-04T12:52:45.8025878Z ============================= test session starts ============================== 2025-12-04T12:52:45.8025991Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8026031Z cachedir: .pytest_cache 2025-12-04T12:52:45.8026191Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8026237Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8026277Z configfile: pytest.ini 2025-12-04T12:52:45.8026438Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8026512Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8026775Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8026820Z Running 1 items in this shard 2025-12-04T12:52:45.8026822Z 2025-12-04T12:52:45.8027168Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda I1204 12:49:30.559000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 501150 2025-12-04T12:52:45.8027326Z I1204 12:49:30.560000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 501151 2025-12-04T12:52:45.8027479Z I1204 12:49:30.561000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 501152 2025-12-04T12:52:45.8027629Z I1204 12:49:30.562000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 501153 2025-12-04T12:52:45.8028136Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8028247Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8028733Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8028813Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8029297Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8029369Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8029855Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8029928Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8030073Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8030237Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8030526Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8030681Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8030966Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8031089Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8031368Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8031516Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8031795Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8031943Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8032216Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8032365Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8032642Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8032792Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8033316Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8033434Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8033630Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8034045Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8034171Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8034382Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8034546Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8034585Z dist init r=2, world=4 2025-12-04T12:52:45.8034725Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8034885Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8035173Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8035327Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8035612Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8035739Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8036015Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8036164Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8036441Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8036602Z [rank3]:E1204 12:49:37.665000 501153 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8036878Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8037015Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8037294Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8037451Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8037964Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8038098Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8038324Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8038724Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8038839Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8039255Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8039420Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8039459Z dist init r=3, world=4 2025-12-04T12:52:45.8039596Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8039757Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8040043Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8040197Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8040482Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8040606Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8040885Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8041035Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8041326Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8041476Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8041750Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8041908Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8042185Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8042333Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8042859Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8042989Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8043184Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8043585Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8043702Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8043911Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8044077Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8044116Z dist init r=1, world=4 2025-12-04T12:52:45.8044256Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8044414Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8044699Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8044855Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8045139Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8045263Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8045549Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8045700Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8045977Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8046132Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8046415Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8046551Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8046829Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8046997Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8047509Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8047624Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8047820Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8048269Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8048384Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8048594Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8048757Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8048797Z dist init r=0, world=4 2025-12-04T12:52:45.8049137Z [rank0]:[W1204 12:49:38.838474802 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8049179Z FAILED [8.9135s] [100%] 2025-12-04T12:52:45.8049181Z 2025-12-04T12:52:45.8049237Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8049371Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8049419Z Traceback (most recent call last): 2025-12-04T12:52:45.8049595Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8049641Z self._join_processes(fn) 2025-12-04T12:52:45.8049813Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8049870Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8050047Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8050091Z raise RuntimeError(error) 2025-12-04T12:52:45.8050170Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8050216Z Traceback (most recent call last): 2025-12-04T12:52:45.8050387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8050432Z getattr(self, test_name)() 2025-12-04T12:52:45.8050590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8050626Z fn() 2025-12-04T12:52:45.8050778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8050845Z method(*args, **kwargs) 2025-12-04T12:52:45.8050995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8051036Z method(*args, **kwargs) 2025-12-04T12:52:45.8051185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8051223Z with policy(): 2025-12-04T12:52:45.8051375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8051416Z raise RuntimeError(msg) 2025-12-04T12:52:45.8051805Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 
2025-12-04T12:52:45.8051809Z 2025-12-04T12:52:45.8051884Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8052155Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8052157Z 2025-12-04T12:52:45.8052245Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8052248Z 2025-12-04T12:52:45.8052249Z 2025-12-04T12:52:45.8052325Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8052413Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8052647Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-57e00d741dcc51b9.xml - 2025-12-04T12:52:45.8052710Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8052993Z FAILED [8.9135s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8053042Z Traceback (most recent call last): 2025-12-04T12:52:45.8053206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8053251Z getattr(self, test_name)() 2025-12-04T12:52:45.8053409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8053455Z fn() 2025-12-04T12:52:45.8053606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8053649Z method(*args, **kwargs) 2025-12-04T12:52:45.8053800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8053842Z method(*args, **kwargs) 2025-12-04T12:52:45.8053992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8054029Z with policy(): 2025-12-04T12:52:45.8054189Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8054232Z raise RuntimeError(msg) 2025-12-04T12:52:45.8054619Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8054632Z 2025-12-04T12:52:45.8054705Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8054987Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8054989Z 2025-12-04T12:52:45.8055076Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8055142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
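The failures above all come from the test harness's CUDA memory leak check, which snapshots the caching-allocator and driver-level allocation counts for each device before the test body and compares them afterwards (the "was 512 and is now reported as 4608" figures in the RuntimeError are exactly such a before/after pair, and the "CUDA driver API confirmed a leak" wording refers to the second, driver-level comparison). The sketch below only illustrates that kind of before/after comparison using public `torch.cuda` calls; the function name and tolerance argument are hypothetical, and this is not the internal `CudaMemoryLeakCheck` code.

    import torch

    def check_for_leak(fn, device: int = 0, driver_tol_bytes: int = 0) -> None:
        # Snapshot both views of GPU memory before running the test body.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)    # driver-level (free, total)
        driver_before = total - free_before

        fn()  # the test body under inspection

        # Compare again after returning cached blocks to the driver.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if alloc_after > alloc_before and driver_after > driver_before + driver_tol_bytes:
            raise RuntimeError(
                f"possible leak: caching allocator {alloc_before} -> {alloc_after} bytes, "
                f"driver {driver_before} -> {driver_after} bytes on device {device}"
            )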
2025-12-04T12:52:45.8055204Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8055243Z Got exit code 1 2025-12-04T12:52:45.8055283Z Retrying single test... 2025-12-04T12:52:45.8055471Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-0d51898b7f977c61.xml 2025-12-04T12:52:45.8055529Z ============================= test session starts ============================== 2025-12-04T12:52:45.8055644Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8055684Z cachedir: .pytest_cache 2025-12-04T12:52:45.8055842Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8055887Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8055928Z configfile: pytest.ini 2025-12-04T12:52:45.8056090Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8056165Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8056429Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8056474Z Running 1 items in this shard 2025-12-04T12:52:45.8056476Z 2025-12-04T12:52:45.8056819Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda I1204 12:49:42.111000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 501552 2025-12-04T12:52:45.8056972Z I1204 12:49:42.112000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 501553 2025-12-04T12:52:45.8057125Z I1204 12:49:42.113000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 501554 2025-12-04T12:52:45.8057276Z I1204 12:49:42.113000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 501555 2025-12-04T12:52:45.8057791Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8057856Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8058414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8058476Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8058962Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. 
FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8059045Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8059528Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8059587Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8059732Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8059893Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8060184Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8060339Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8060624Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8060750Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8061031Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8061178Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8061454Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8061600Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8061891Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8062031Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8062307Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8062465Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8062981Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8063096Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8063313Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8063712Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8063826Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8064037Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8064203Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8064245Z dist init r=2, world=4 2025-12-04T12:52:45.8064382Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8064543Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8064829Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8064983Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8065265Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8065391Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8065669Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8065816Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8066101Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8066247Z [rank1]:E1204 12:49:49.139000 501553 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8066524Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8066659Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8066945Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8067094Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8067605Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8067742Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8067938Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8068367Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8068480Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8068692Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8068855Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8068895Z dist init r=1, world=4 2025-12-04T12:52:45.8069031Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8069191Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8069478Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8069633Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8069916Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8070041Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8070335Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8070481Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8070757Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8070906Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8071194Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8071332Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8071609Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8071791Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8072302Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8072416Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8072615Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8073010Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8073126Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8073337Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8073502Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8073540Z dist init r=0, world=4 2025-12-04T12:52:45.8073678Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8073838Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8074125Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8074279Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8074572Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8074697Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8074975Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8075123Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8075408Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8075555Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8075832Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8075986Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8076263Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8076412Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8076924Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8077040Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8077237Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8077633Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8077747Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8077959Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8078122Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8078199Z dist init r=3, world=4 2025-12-04T12:52:45.8078535Z [rank0]:[W1204 12:49:49.052112048 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8078577Z FAILED [8.9132s] [100%] 2025-12-04T12:52:45.8078579Z 2025-12-04T12:52:45.8078635Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8078785Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8078832Z Traceback (most recent call last): 2025-12-04T12:52:45.8078993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8079039Z self._join_processes(fn) 2025-12-04T12:52:45.8079212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8079266Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8079457Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8079501Z raise RuntimeError(error) 2025-12-04T12:52:45.8079582Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8079627Z Traceback (most recent call last): 2025-12-04T12:52:45.8079788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8079831Z getattr(self, test_name)() 2025-12-04T12:52:45.8080000Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8080048Z fn() 2025-12-04T12:52:45.8080198Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8080239Z method(*args, **kwargs) 2025-12-04T12:52:45.8080390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8080432Z method(*args, **kwargs) 2025-12-04T12:52:45.8080581Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8080619Z with policy(): 2025-12-04T12:52:45.8080772Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8080814Z raise RuntimeError(msg) 2025-12-04T12:52:45.8081202Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8081206Z 2025-12-04T12:52:45.8081281Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8081552Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8081554Z 2025-12-04T12:52:45.8081642Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8081645Z 2025-12-04T12:52:45.8081704Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8081750Z Traceback (most recent call last): 2025-12-04T12:52:45.8081922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8081964Z getattr(self, test_name)() 2025-12-04T12:52:45.8082123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8082156Z fn() 2025-12-04T12:52:45.8082309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8082349Z method(*args, **kwargs) 2025-12-04T12:52:45.8082500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8082542Z method(*args, **kwargs) 2025-12-04T12:52:45.8082702Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8082740Z with policy(): 2025-12-04T12:52:45.8082892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8082934Z raise RuntimeError(msg) 2025-12-04T12:52:45.8083329Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8083331Z 2025-12-04T12:52:45.8083406Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8083674Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8083677Z 2025-12-04T12:52:45.8083764Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8083777Z 2025-12-04T12:52:45.8083778Z 2025-12-04T12:52:45.8083865Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8083952Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8084185Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-0d51898b7f977c61.xml - 2025-12-04T12:52:45.8084247Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8084531Z FAILED [8.9132s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8084576Z Traceback (most recent call last): 2025-12-04T12:52:45.8084739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8084782Z getattr(self, test_name)() 2025-12-04T12:52:45.8084943Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8084978Z fn() 2025-12-04T12:52:45.8085128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8085167Z method(*args, **kwargs) 2025-12-04T12:52:45.8085319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8085359Z method(*args, **kwargs) 2025-12-04T12:52:45.8085512Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8085549Z with policy(): 2025-12-04T12:52:45.8085701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8085743Z raise RuntimeError(msg) 2025-12-04T12:52:45.8086129Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8086131Z 2025-12-04T12:52:45.8086207Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8086478Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8086490Z 2025-12-04T12:52:45.8086578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8086580Z 2025-12-04T12:52:45.8086640Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8086687Z Traceback (most recent call last): 2025-12-04T12:52:45.8086850Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8086893Z getattr(self, test_name)() 2025-12-04T12:52:45.8087050Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8087088Z fn() 2025-12-04T12:52:45.8087250Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8087291Z method(*args, **kwargs) 2025-12-04T12:52:45.8087443Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8087482Z method(*args, **kwargs) 2025-12-04T12:52:45.8087631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8087696Z with policy(): 2025-12-04T12:52:45.8087847Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8087887Z raise RuntimeError(msg) 2025-12-04T12:52:45.8088337Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8088340Z 2025-12-04T12:52:45.8088412Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8088682Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8088685Z 2025-12-04T12:52:45.8088770Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8088837Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
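Each run in this shard also repeats the `_init_utils.py:571` UserWarning: FSDP received `device_id` as the bare device string "cuda" and fell back to the current device for that rank. The warning itself names the two remedies, either calling `torch.cuda.set_device()` before FSDP initialization or passing an explicit device index as `device_id`. A minimal per-rank sketch along those lines follows; the wrapped module and the rendezvous setup are placeholders, not code from this test file.

    import torch
    import torch.distributed as dist
    from torch import nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def setup_fsdp(rank: int, world_size: int) -> FSDP:
        # Assumes the launcher already exported MASTER_ADDR / MASTER_PORT for rendezvous.
        torch.cuda.set_device(rank)  # pin this process to its GPU before wrapping, per the warning
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        model = nn.Linear(8, 8)      # placeholder module for illustration
        # Passing an explicit index (instead of the bare "cuda" device) avoids the warning entirely.
        return FSDP(model, device_id=rank)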
2025-12-04T12:52:45.8088900Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8088939Z Got exit code 1 2025-12-04T12:52:45.8089159Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8089289Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8089477Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-959385e2c420bf4a.xml 2025-12-04T12:52:45.8089536Z ============================= test session starts ============================== 2025-12-04T12:52:45.8089649Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8089691Z cachedir: .pytest_cache 2025-12-04T12:52:45.8089852Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8089899Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8089942Z configfile: pytest.ini 2025-12-04T12:52:45.8090103Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8090175Z collecting ... collected 10 items / 5 deselected / 5 selected 2025-12-04T12:52:45.8090228Z stepcurrent: skipping 5 already run items. 2025-12-04T12:52:45.8090287Z Running 5 items in this shard 2025-12-04T12:52:45.8090289Z 2025-12-04T12:52:45.8090633Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda I1204 12:49:53.530000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 501954 2025-12-04T12:52:45.8090792Z I1204 12:49:53.531000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 501955 2025-12-04T12:52:45.8090944Z I1204 12:49:53.532000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 501956 2025-12-04T12:52:45.8091111Z I1204 12:49:53.532000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 501957 2025-12-04T12:52:45.8091608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8091681Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8092183Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8092244Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8092729Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8092789Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8093275Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8093332Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8093476Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8093639Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8093927Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8094083Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8094370Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8094493Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8094782Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8094931Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8095212Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8095369Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8095646Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8095784Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8096061Z [rank3]:E1204 12:50:00.604000 501957 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8096229Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8096742Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8096859Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8097054Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8097457Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8097577Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8097786Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8097954Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8097993Z dist init r=3, world=4 2025-12-04T12:52:45.8098130Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8098318Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8098607Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8098760Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8099060Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8099187Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8099464Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8099612Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8099901Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8100051Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8100326Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8100486Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8100765Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8100912Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8101427Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8101543Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8101737Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8102139Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8102254Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8102466Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8102630Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8102672Z dist init r=1, world=4 2025-12-04T12:52:45.8102809Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8102971Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8103264Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8103418Z [rank2]:E1204 12:50:00.621000 501956 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8103704Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8103828Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8104123Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8104272Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8104550Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8104714Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8104988Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8105127Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8105406Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8105556Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8106066Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 
2025-12-04T12:52:45.8106184Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8106378Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8106779Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8106894Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8107105Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8107273Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8107310Z dist init r=2, world=4 2025-12-04T12:52:45.8107460Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8107617Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8107905Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8108058Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8108391Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8108517Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8108793Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8108966Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8109241Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8109389Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8109665Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8109802Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8110081Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8110230Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8110742Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8110854Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8111049Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8111450Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8111569Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8111795Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8111960Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8112001Z dist init r=0, world=4 2025-12-04T12:52:45.8112338Z [rank0]:[W1204 12:50:00.601664434 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8112378Z FAILED [9.0149s] [ 20%] 2025-12-04T12:52:45.8112381Z 2025-12-04T12:52:45.8112447Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8112584Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8112631Z Traceback (most recent call last): 2025-12-04T12:52:45.8112796Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8112839Z self._join_processes(fn) 2025-12-04T12:52:45.8113011Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8113085Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8113263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8113308Z raise RuntimeError(error) 2025-12-04T12:52:45.8113388Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8113435Z Traceback (most recent call last): 2025-12-04T12:52:45.8113597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8113640Z getattr(self, test_name)() 2025-12-04T12:52:45.8113801Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8113836Z fn() 2025-12-04T12:52:45.8113988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8114032Z method(*args, **kwargs) 2025-12-04T12:52:45.8114183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8114225Z method(*args, **kwargs) 2025-12-04T12:52:45.8114376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8114416Z with policy(): 2025-12-04T12:52:45.8114566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8114609Z raise RuntimeError(msg) 2025-12-04T12:52:45.8114994Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8114998Z 2025-12-04T12:52:45.8115073Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8115344Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8115347Z 2025-12-04T12:52:45.8115436Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8115439Z 2025-12-04T12:52:45.8115500Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8115544Z Traceback (most recent call last): 2025-12-04T12:52:45.8115716Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8115760Z getattr(self, test_name)() 2025-12-04T12:52:45.8115920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8115957Z fn() 2025-12-04T12:52:45.8116110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8116149Z method(*args, **kwargs) 2025-12-04T12:52:45.8116309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8116348Z method(*args, **kwargs) 2025-12-04T12:52:45.8116500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8116537Z with policy(): 2025-12-04T12:52:45.8116692Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8116732Z raise RuntimeError(msg) 2025-12-04T12:52:45.8117130Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8117144Z 2025-12-04T12:52:45.8117220Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8117488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8117490Z 2025-12-04T12:52:45.8117578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8117581Z 2025-12-04T12:52:45.8117583Z 2025-12-04T12:52:45.8117657Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8117747Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8117981Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-959385e2c420bf4a.xml - 2025-12-04T12:52:45.8118043Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8118358Z FAILED [9.0149s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8118408Z Traceback (most recent call last): 2025-12-04T12:52:45.8118573Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8118615Z getattr(self, test_name)() 2025-12-04T12:52:45.8118775Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8118811Z fn() 2025-12-04T12:52:45.8118963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8119002Z method(*args, **kwargs) 2025-12-04T12:52:45.8119152Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8119192Z method(*args, **kwargs) 2025-12-04T12:52:45.8119341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8119379Z with policy(): 2025-12-04T12:52:45.8119550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8119592Z raise RuntimeError(msg) 2025-12-04T12:52:45.8119981Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8119985Z 2025-12-04T12:52:45.8120059Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8120342Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8120345Z 2025-12-04T12:52:45.8120432Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8120434Z 2025-12-04T12:52:45.8120492Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8120538Z Traceback (most recent call last): 2025-12-04T12:52:45.8120700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8120767Z getattr(self, test_name)() 2025-12-04T12:52:45.8120924Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8120959Z fn() 2025-12-04T12:52:45.8121108Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8121148Z method(*args, **kwargs) 2025-12-04T12:52:45.8121298Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8121337Z method(*args, **kwargs) 2025-12-04T12:52:45.8121487Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8121527Z with policy(): 2025-12-04T12:52:45.8121676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8121722Z raise RuntimeError(msg) 2025-12-04T12:52:45.8122103Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8122107Z 2025-12-04T12:52:45.8122181Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8124287Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8124290Z 2025-12-04T12:52:45.8124385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8124449Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8124518Z ======================= 1 failed, 5 deselected in 9.02s ======================== 2025-12-04T12:52:45.8124556Z Got exit code 1 2025-12-04T12:52:45.8124597Z Retrying single test... 
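[editor note] The leak report above is produced by the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 wraps around each test: it snapshots both the caching-allocator counters and the driver-level usage before the test body and compares them afterwards, raising when either number grows. A minimal, simplified sketch of that idea (not the actual implementation in torch/testing/_internal/common_utils.py; function name and structure here are illustrative), assuming a CUDA/ROCm-enabled torch build:

    import torch

    def check_for_leak(fn, device=0):
        # Snapshot caching-allocator and driver-level usage before the test body.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)

        fn()  # run the test body

        # Re-read the same counters afterwards; growth suggests a leak.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)

        driver_before = total - free_before
        driver_after = total - free_after
        if alloc_after > alloc_before or driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: "
                f"caching allocator {alloc_before} -> {alloc_after}, "
                f"driver {driver_before} -> {driver_after}"
            )

The numbers in the log (512 -> 4608 allocator bytes, ~2.3 GB -> ~3.0 GB driver memory per rank) correspond to exactly this kind of before/after comparison; the real check does additional bookkeeping to reduce false positives.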
2025-12-04T12:52:45.8124789Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bf894163f060bb1c.xml 2025-12-04T12:52:45.8124847Z ============================= test session starts ============================== 2025-12-04T12:52:45.8124961Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8125002Z cachedir: .pytest_cache 2025-12-04T12:52:45.8125182Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8125229Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8125270Z configfile: pytest.ini 2025-12-04T12:52:45.8125433Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8125508Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8125771Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8125815Z Running 1 items in this shard 2025-12-04T12:52:45.8125831Z 2025-12-04T12:52:45.8126178Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda I1204 12:50:04.916000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 502356 2025-12-04T12:52:45.8126333Z I1204 12:50:04.917000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 502357 2025-12-04T12:52:45.8126501Z I1204 12:50:04.918000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 502358 2025-12-04T12:52:45.8126664Z I1204 12:50:04.919000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 502359 2025-12-04T12:52:45.8127169Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8127232Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8127723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8127786Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8128310Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8128368Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8128855Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8128914Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8129058Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8129222Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8129538Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8129694Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8129981Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8130108Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8130398Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8130547Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8130825Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8130998Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8131273Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8131411Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8131690Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8131839Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8132356Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8132473Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8132669Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8133068Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8133184Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8133395Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8133561Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8133600Z dist init r=2, world=4 2025-12-04T12:52:45.8133739Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8133914Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8134204Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8134360Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8134654Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8134780Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8135056Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8135213Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8135511Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8135658Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8135933Z [rank0]:E1204 12:50:12.076000 502356 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8136069Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8136349Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8136499Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8137012Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2464153600 and is now 3196059648. 2025-12-04T12:52:45.8137127Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8137321Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8137721Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8137835Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8138047Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8138271Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8138313Z dist init r=0, world=4 2025-12-04T12:52:45.8138451Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8138611Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8138914Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8139067Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8139351Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8139486Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8139775Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8139922Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8140197Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8140344Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8140617Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8140756Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8141034Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8141184Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8141695Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
2025-12-04T12:52:45.8141811Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8142005Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8142409Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8142523Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8142732Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8142897Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8142935Z dist init r=3, world=4 2025-12-04T12:52:45.8143075Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8143245Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8143533Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8143686Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8143990Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8144113Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8144388Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8144536Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8144811Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8144959Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8145234Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8145369Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8145648Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8145795Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8146307Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8146421Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8146625Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8147020Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8147135Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8147356Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8147519Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8147558Z dist init r=1, world=4 2025-12-04T12:52:45.8147895Z [rank0]:[W1204 12:50:12.922941439 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8147945Z FAILED [8.9130s] [100%] 2025-12-04T12:52:45.8147957Z 2025-12-04T12:52:45.8148014Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8148182Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8148229Z Traceback (most recent call last): 2025-12-04T12:52:45.8148393Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8148437Z self._join_processes(fn) 2025-12-04T12:52:45.8148609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8148665Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8148841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8148886Z raise RuntimeError(error) 2025-12-04T12:52:45.8148967Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8149012Z Traceback (most recent call last): 2025-12-04T12:52:45.8149171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8149214Z getattr(self, test_name)() 2025-12-04T12:52:45.8149372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8149407Z fn() 2025-12-04T12:52:45.8149558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8149600Z method(*args, **kwargs) 2025-12-04T12:52:45.8149750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8149791Z method(*args, **kwargs) 2025-12-04T12:52:45.8149941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8149978Z with policy(): 2025-12-04T12:52:45.8150129Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8150171Z raise RuntimeError(msg) 2025-12-04T12:52:45.8150559Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2464153600 and is now 3196059648. 
2025-12-04T12:52:45.8150575Z 2025-12-04T12:52:45.8150650Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8150919Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8150923Z 2025-12-04T12:52:45.8151010Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8151012Z 2025-12-04T12:52:45.8151072Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8151116Z Traceback (most recent call last): 2025-12-04T12:52:45.8151293Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8151335Z getattr(self, test_name)() 2025-12-04T12:52:45.8151495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8151529Z fn() 2025-12-04T12:52:45.8151679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8151741Z method(*args, **kwargs) 2025-12-04T12:52:45.8151903Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8151943Z method(*args, **kwargs) 2025-12-04T12:52:45.8152091Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8152128Z with policy(): 2025-12-04T12:52:45.8152279Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8152320Z raise RuntimeError(msg) 2025-12-04T12:52:45.8152708Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8152711Z 2025-12-04T12:52:45.8152785Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8153054Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8153057Z 2025-12-04T12:52:45.8153144Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8153147Z 2025-12-04T12:52:45.8153149Z 2025-12-04T12:52:45.8153225Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8153312Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8153547Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bf894163f060bb1c.xml - 2025-12-04T12:52:45.8153607Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8153890Z FAILED [8.9130s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8153936Z Traceback (most recent call last): 2025-12-04T12:52:45.8154099Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8154141Z getattr(self, test_name)() 2025-12-04T12:52:45.8154300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8154334Z fn() 2025-12-04T12:52:45.8154496Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8154535Z method(*args, **kwargs) 2025-12-04T12:52:45.8154688Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8154727Z method(*args, **kwargs) 2025-12-04T12:52:45.8154877Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8154913Z with policy(): 2025-12-04T12:52:45.8155075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8155115Z raise RuntimeError(msg) 2025-12-04T12:52:45.8155501Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2464153600 and is now 3196059648. 
2025-12-04T12:52:45.8155504Z 2025-12-04T12:52:45.8155587Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8155865Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8155867Z 2025-12-04T12:52:45.8155954Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8155956Z 2025-12-04T12:52:45.8156016Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8156062Z Traceback (most recent call last): 2025-12-04T12:52:45.8156223Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8156266Z getattr(self, test_name)() 2025-12-04T12:52:45.8156424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8156458Z fn() 2025-12-04T12:52:45.8156608Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8156649Z method(*args, **kwargs) 2025-12-04T12:52:45.8156799Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8156838Z method(*args, **kwargs) 2025-12-04T12:52:45.8156988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8157024Z with policy(): 2025-12-04T12:52:45.8157175Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8157215Z raise RuntimeError(msg) 2025-12-04T12:52:45.8157602Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8157606Z 2025-12-04T12:52:45.8157677Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8157944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8157947Z 2025-12-04T12:52:45.8158033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8158097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8158216Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8158254Z Got exit code 1 2025-12-04T12:52:45.8158293Z Retrying single test... 
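[editor note] Two recurring warnings in the retries above point at per-rank setup hygiene that the messages themselves recommend: the FSDP UserWarning asks for an explicit device index (or a prior torch.cuda.set_device() call) instead of the bare "cuda" string, and ProcessGroupNCCL warns that destroy_process_group() was never called before exit. A rough sketch of that recommended shape (hypothetical helper and model_fn, not the test's actual code; assumes MASTER_ADDR/MASTER_PORT are set and one GPU per rank):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def run(rank: int, world_size: int, model_fn):
        # Bind this process to one GPU before any collective or FSDP call,
        # as the UserWarning in the log suggests.
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        # Pass an explicit device index rather than the bare "cuda" string.
        model = FSDP(model_fn().cuda(rank), device_id=rank)

        # ... forward/backward steps for the test body would go here ...

        # Tear the process group down explicitly so NCCL/RCCL resources are
        # released, avoiding the "destroy_process_group() was not called" warning.
        dist.destroy_process_group()

Neither warning is the reported failure itself (the failure is the leak check), but both indicate cleanup paths that the multiprocess test harness skips when a rank exits with code 10.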
2025-12-04T12:52:45.8158484Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a6fd76a4add03bc4.xml 2025-12-04T12:52:45.8158545Z ============================= test session starts ============================== 2025-12-04T12:52:45.8158655Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8158697Z cachedir: .pytest_cache 2025-12-04T12:52:45.8158868Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8158916Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8158955Z configfile: pytest.ini 2025-12-04T12:52:45.8159118Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8159191Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8159453Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8159522Z Running 1 items in this shard 2025-12-04T12:52:45.8159524Z 2025-12-04T12:52:45.8159868Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda I1204 12:50:16.506000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 502758 2025-12-04T12:52:45.8160023Z I1204 12:50:16.507000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 502759 2025-12-04T12:52:45.8160176Z I1204 12:50:16.508000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 502760 2025-12-04T12:52:45.8160326Z I1204 12:50:16.509000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 502761 2025-12-04T12:52:45.8160821Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8160884Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8161371Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8161431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8161917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8161976Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8162470Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8162527Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8162671Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8162833Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8163134Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8163288Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8163574Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8163699Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8163993Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8164141Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8164418Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8164566Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8164839Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8164978Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8165256Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8165404Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8165918Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8166035Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8166231Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8166632Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8166755Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8166967Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8167131Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8167170Z dist init r=1, world=4 2025-12-04T12:52:45.8167307Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8167487Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8167774Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8167927Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8168256Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8168380Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8168657Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8168804Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8169081Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8169228Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8169503Z [rank2]:E1204 12:50:23.886000 502760 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8169640Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8169916Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8170064Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8170578Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8170694Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8170887Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8171308Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8171425Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8171635Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8171813Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8171852Z dist init r=2, world=4 2025-12-04T12:52:45.8171991Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8172150Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8172452Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8172617Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8172904Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8173028Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8173304Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8173455Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8173731Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8173879Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8174154Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8174290Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8174568Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8174715Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8175235Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
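The UserWarning repeated above ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") can be avoided by pinning each rank to an explicit device before wrapping the model. A minimal sketch, assuming the process group is already initialized; the function and argument names here are illustrative, not taken from this run:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model, rank):
        torch.cuda.set_device(rank)                    # pin this process to an explicit device index
        return FSDP(model.cuda(rank), device_id=rank)  # pass an indexed device_id instead of bare "cuda"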
2025-12-04T12:52:45.8175348Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8175544Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8175941Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8176065Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8176276Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8176439Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8176487Z dist init r=3, world=4 2025-12-04T12:52:45.8176632Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8176791Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8177077Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8177230Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8177515Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8177639Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8177915Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8178062Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8178379Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8178525Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8178801Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8178937Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8179214Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8179362Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8179884Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8179999Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8180205Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8180602Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8180716Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8180938Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8181114Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8181152Z dist init r=0, world=4 2025-12-04T12:52:45.8181488Z [rank0]:[W1204 12:50:24.070318785 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8181527Z FAILED [9.1173s] [100%] 2025-12-04T12:52:45.8181529Z 2025-12-04T12:52:45.8181585Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8181718Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8181766Z Traceback (most recent call last): 2025-12-04T12:52:45.8181929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8181973Z self._join_processes(fn) 2025-12-04T12:52:45.8182144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8182200Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8182378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8182422Z raise RuntimeError(error) 2025-12-04T12:52:45.8182506Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8182550Z Traceback (most recent call last): 2025-12-04T12:52:45.8182711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8182754Z getattr(self, test_name)() 2025-12-04T12:52:45.8182912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8182946Z fn() 2025-12-04T12:52:45.8183099Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8183140Z method(*args, **kwargs) 2025-12-04T12:52:45.8183290Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8183329Z method(*args, **kwargs) 2025-12-04T12:52:45.8183499Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8183536Z with policy(): 2025-12-04T12:52:45.8183688Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8183730Z raise RuntimeError(msg) 2025-12-04T12:52:45.8184117Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8184119Z 2025-12-04T12:52:45.8184204Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8184476Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8184478Z 2025-12-04T12:52:45.8184568Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8184582Z 2025-12-04T12:52:45.8184583Z 2025-12-04T12:52:45.8184659Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8184757Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8184992Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a6fd76a4add03bc4.xml - 2025-12-04T12:52:45.8185053Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8185335Z FAILED [9.1173s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8185383Z Traceback (most recent call last): 2025-12-04T12:52:45.8185546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8185590Z getattr(self, test_name)() 2025-12-04T12:52:45.8185751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8185784Z fn() 2025-12-04T12:52:45.8185936Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8185976Z method(*args, **kwargs) 2025-12-04T12:52:45.8186127Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8186166Z method(*args, **kwargs) 2025-12-04T12:52:45.8186316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8186353Z with policy(): 2025-12-04T12:52:45.8186504Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8186545Z raise RuntimeError(msg) 2025-12-04T12:52:45.8186934Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8186936Z 2025-12-04T12:52:45.8187010Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8187281Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8187283Z 2025-12-04T12:52:45.8187380Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8187445Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8187509Z ======================= 1 failed, 9 deselected in 9.13s ======================== 2025-12-04T12:52:45.8187547Z Got exit code 1 2025-12-04T12:52:45.8187766Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8187894Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8188093Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-fc624f2ff706e807.xml 2025-12-04T12:52:45.8188184Z ============================= test session starts ============================== 2025-12-04T12:52:45.8188299Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8188340Z cachedir: .pytest_cache 2025-12-04T12:52:45.8188497Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8188576Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8188617Z configfile: pytest.ini 2025-12-04T12:52:45.8188777Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8188850Z collecting ... collected 10 items / 6 deselected / 4 selected 2025-12-04T12:52:45.8188902Z stepcurrent: skipping 6 already run items. 2025-12-04T12:52:45.8188947Z Running 4 items in this shard 2025-12-04T12:52:45.8188949Z 2025-12-04T12:52:45.8189290Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda I1204 12:50:28.562000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 503160 2025-12-04T12:52:45.8189444Z I1204 12:50:28.563000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 503161 2025-12-04T12:52:45.8189598Z I1204 12:50:28.564000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 503162 2025-12-04T12:52:45.8189750Z I1204 12:50:28.565000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 503163 2025-12-04T12:52:45.8190247Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8190310Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8190795Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8190857Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8191340Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8191414Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8191902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8191961Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8192104Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8192282Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8192572Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8192726Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8193031Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8193157Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8193435Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8193583Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8193859Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8194006Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8194282Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8194420Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8194703Z [rank1]:E1204 12:50:35.632000 503161 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8194852Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8195367Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8195484Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8195679Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8196085Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8196202Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8196412Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8196585Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8196624Z dist init r=1, world=4 2025-12-04T12:52:45.8196763Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8196922Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8197222Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8197394Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8197677Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8197802Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8198079Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8198267Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8198541Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8198689Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8198964Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8199099Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8199381Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8199530Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8200057Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8200171Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8200367Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8200761Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8200886Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8201097Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8201259Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8201309Z dist init r=3, world=4 2025-12-04T12:52:45.8201459Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8201618Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8201906Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8202061Z [rank0]:E1204 12:50:35.768000 503160 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8202344Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8202468Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8202744Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8202892Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8203168Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8203314Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8203589Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8203726Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8204005Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8204153Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8204670Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8204786Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8204989Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8205383Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8205496Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8205714Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8205889Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8205927Z dist init r=0, world=4 2025-12-04T12:52:45.8206065Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8206224Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8206512Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8206665Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8206949Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8207075Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8207348Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8207496Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8207769Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8207918Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8208232Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8208369Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8208660Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8208809Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8209331Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8209444Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8209639Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8210033Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8210168Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8210379Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8210542Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8210581Z dist init r=2, world=4 2025-12-04T12:52:45.8210919Z [rank0]:[W1204 12:50:36.713346171 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8210961Z FAILED [9.3136s] [ 25%] 2025-12-04T12:52:45.8210963Z 2025-12-04T12:52:45.8211019Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8211153Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8211199Z Traceback (most recent call last): 2025-12-04T12:52:45.8211362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8211407Z self._join_processes(fn) 2025-12-04T12:52:45.8211579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8211636Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8211813Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8211858Z raise RuntimeError(error) 2025-12-04T12:52:45.8211938Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8211983Z Traceback (most recent call last): 2025-12-04T12:52:45.8212145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8212188Z getattr(self, test_name)() 2025-12-04T12:52:45.8212345Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8212380Z fn() 2025-12-04T12:52:45.8212539Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8212580Z method(*args, **kwargs) 2025-12-04T12:52:45.8212730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8212771Z method(*args, **kwargs) 2025-12-04T12:52:45.8212920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8212957Z with policy(): 2025-12-04T12:52:45.8213110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8213151Z raise RuntimeError(msg) 2025-12-04T12:52:45.8213549Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8213552Z 2025-12-04T12:52:45.8213627Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8213910Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8213924Z 2025-12-04T12:52:45.8214012Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8214014Z 2025-12-04T12:52:45.8214074Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8214118Z Traceback (most recent call last): 2025-12-04T12:52:45.8214282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8214323Z getattr(self, test_name)() 2025-12-04T12:52:45.8214483Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8214517Z fn() 2025-12-04T12:52:45.8214668Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8214709Z method(*args, **kwargs) 2025-12-04T12:52:45.8214859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8214897Z method(*args, **kwargs) 2025-12-04T12:52:45.8215047Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8215083Z with policy(): 2025-12-04T12:52:45.8215237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8215277Z raise RuntimeError(msg) 2025-12-04T12:52:45.8215664Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8215668Z 2025-12-04T12:52:45.8215742Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8216009Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8216011Z 2025-12-04T12:52:45.8216100Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8216102Z 2025-12-04T12:52:45.8216159Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8216204Z Traceback (most recent call last): 2025-12-04T12:52:45.8216374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8216417Z getattr(self, test_name)() 2025-12-04T12:52:45.8216574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8216609Z fn() 2025-12-04T12:52:45.8216759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8216798Z method(*args, **kwargs) 2025-12-04T12:52:45.8216947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8216997Z method(*args, **kwargs) 2025-12-04T12:52:45.8217147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8217183Z with policy(): 2025-12-04T12:52:45.8217334Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8217375Z raise RuntimeError(msg) 2025-12-04T12:52:45.8217758Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8217782Z 2025-12-04T12:52:45.8217854Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8218122Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8218124Z 2025-12-04T12:52:45.8218248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8218250Z 2025-12-04T12:52:45.8218253Z 2025-12-04T12:52:45.8218329Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8218417Z Process 0 terminated with exit code 10, terminating remaining processes. 
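The ProcessGroupNCCL warning emitted just before the FAILED line ("destroy_process_group() was not called before program exit, which can leak resources") points at missing teardown in the spawned test processes. A minimal sketch of an explicit teardown, assuming torch.distributed is the process-group API in use; the helper name is illustrative:

    import torch.distributed as dist

    def teardown_process_group():
        if dist.is_initialized():         # only tear down if a group was actually created
            dist.barrier()                # let all ranks finish outstanding collectives first
            dist.destroy_process_group()  # release NCCL/RCCL communicator resources before exit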
2025-12-04T12:52:45.8218651Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-fc624f2ff706e807.xml - 2025-12-04T12:52:45.8218712Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8218993Z FAILED [9.3136s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8219039Z Traceback (most recent call last): 2025-12-04T12:52:45.8219201Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8219244Z getattr(self, test_name)() 2025-12-04T12:52:45.8219403Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8219438Z fn() 2025-12-04T12:52:45.8219588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8219629Z method(*args, **kwargs) 2025-12-04T12:52:45.8219778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8219817Z method(*args, **kwargs) 2025-12-04T12:52:45.8219966Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8220010Z with policy(): 2025-12-04T12:52:45.8220163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8220217Z raise RuntimeError(msg) 2025-12-04T12:52:45.8220601Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8220605Z 2025-12-04T12:52:45.8220677Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8220955Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8220957Z 2025-12-04T12:52:45.8221043Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8221045Z 2025-12-04T12:52:45.8221104Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8221149Z Traceback (most recent call last): 2025-12-04T12:52:45.8221312Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8221368Z getattr(self, test_name)() 2025-12-04T12:52:45.8221541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8221574Z fn() 2025-12-04T12:52:45.8221725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8221764Z method(*args, **kwargs) 2025-12-04T12:52:45.8221914Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8221952Z method(*args, **kwargs) 2025-12-04T12:52:45.8222102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8222140Z with policy(): 2025-12-04T12:52:45.8222290Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8222332Z raise RuntimeError(msg) 2025-12-04T12:52:45.8222717Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8222719Z 2025-12-04T12:52:45.8222791Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8223056Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8223058Z 2025-12-04T12:52:45.8223145Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8223147Z 2025-12-04T12:52:45.8223204Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8223251Z Traceback (most recent call last): 2025-12-04T12:52:45.8223412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8223454Z getattr(self, test_name)() 2025-12-04T12:52:45.8223613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8223646Z fn() 2025-12-04T12:52:45.8223797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8223836Z method(*args, **kwargs) 2025-12-04T12:52:45.8223985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8224036Z method(*args, **kwargs) 2025-12-04T12:52:45.8224185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8224223Z with policy(): 2025-12-04T12:52:45.8224374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8224414Z raise RuntimeError(msg) 2025-12-04T12:52:45.8224809Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8224812Z 2025-12-04T12:52:45.8224884Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8225151Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8225163Z 2025-12-04T12:52:45.8225248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8225322Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8225387Z ======================= 1 failed, 6 deselected in 9.32s ======================== 2025-12-04T12:52:45.8225423Z Got exit code 1 2025-12-04T12:52:45.8225463Z Retrying single test... 
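Each failure above is raised by the CUDA memory-leak check, which compares per-device caching-allocator usage before and after the test body (here 512 bytes before versus 4608 bytes after on every rank). A minimal sketch of that kind of before/after comparison, purely illustrative and not the actual implementation in common_utils.py:

    import torch

    def check_for_leak(test_fn, device=0):
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)   # caching-allocator bytes before the test
        test_fn()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)    # caching-allocator bytes after the test
        if after > before:
            raise RuntimeError(f"possible leak on device {device}: {before} -> {after} bytes")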
2025-12-04T12:52:45.8225654Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b62b5adc14a929a0.xml 2025-12-04T12:52:45.8225711Z ============================= test session starts ============================== 2025-12-04T12:52:45.8225822Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8225863Z cachedir: .pytest_cache 2025-12-04T12:52:45.8226020Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8226067Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8226108Z configfile: pytest.ini 2025-12-04T12:52:45.8226271Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8226343Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8226605Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8226648Z Running 1 items in this shard 2025-12-04T12:52:45.8226650Z 2025-12-04T12:52:45.8226993Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda I1204 12:50:40.280000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 503562 2025-12-04T12:52:45.8227149Z I1204 12:50:40.282000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 503563 2025-12-04T12:52:45.8227302Z I1204 12:50:40.282000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 503564 2025-12-04T12:52:45.8227454Z I1204 12:50:40.283000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 503565 2025-12-04T12:52:45.8227959Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8228021Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8228546Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8228610Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8229115Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
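The UserWarning above states its own remedy: make the device explicit before constructing FSDP, either by calling torch.cuda.set_device() first or by passing an indexed device as device_id instead of the bare "cuda" string. A minimal sketch of both options follows; it assumes a single-node job where the local device index equals the process rank and the process group has already been initialized by the test harness.

# Illustrative sketch of the two fixes suggested by the FSDP UserWarning above.
# Assumes torch.distributed.init_process_group(...) has already run and that
# rank maps 1:1 to a local GPU index.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(rank: int) -> FSDP:
    # Option 1: set the current device explicitly before FSDP initialization.
    torch.cuda.set_device(rank)
    model = nn.Linear(8, 8)
    # Option 2: pass an indexed device (or the plain int) rather than "cuda".
    return FSDP(model, device_id=torch.device("cuda", rank))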
2025-12-04T12:52:45.8229173Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8229660Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8229742Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8229886Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8230047Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8230336Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8230491Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8230776Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8230903Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8231179Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8231329Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8231603Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8231753Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8232029Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8232165Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8232453Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8232601Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8233126Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8233241Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8233437Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8233835Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8233968Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8234182Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8234346Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8234385Z dist init r=1, world=4 2025-12-04T12:52:45.8234525Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8234685Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8234974Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8235128Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8235413Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8235539Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8235816Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8235963Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8236240Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8236387Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8236673Z [rank0]:E1204 12:50:47.432000 503562 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8236810Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8237088Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8237236Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8237756Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8237870Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8238085Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8238504Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8238618Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8238830Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8238994Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8239033Z dist init r=0, world=4 2025-12-04T12:52:45.8239170Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8239328Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8239614Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8239767Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8240050Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8240176Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8240451Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8240599Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8240888Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8241036Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8241314Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8241450Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8241738Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8241887Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8242395Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
2025-12-04T12:52:45.8242532Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8242728Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8243122Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8243235Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8243448Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8243611Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8243650Z dist init r=3, world=4 2025-12-04T12:52:45.8243787Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8243947Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8244233Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8244388Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8244671Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8244795Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8245085Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8245232Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8245508Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8245656Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8245944Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8246081Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8246357Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8246524Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8247033Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8247148Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8247343Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8247738Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8247852Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8248064Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8248279Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8248317Z dist init r=2, world=4 2025-12-04T12:52:45.8248652Z [rank0]:[W1204 12:50:47.297044670 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8248692Z FAILED [8.8136s] [100%] 2025-12-04T12:52:45.8248694Z 2025-12-04T12:52:45.8248751Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8248883Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8248931Z Traceback (most recent call last): 2025-12-04T12:52:45.8249093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8249139Z self._join_processes(fn) 2025-12-04T12:52:45.8249344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8249398Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8249577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8249622Z raise RuntimeError(error) 2025-12-04T12:52:45.8249703Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8249748Z Traceback (most recent call last): 2025-12-04T12:52:45.8249923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8249966Z getattr(self, test_name)() 2025-12-04T12:52:45.8250124Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8250159Z fn() 2025-12-04T12:52:45.8250311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8250351Z method(*args, **kwargs) 2025-12-04T12:52:45.8250519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8250572Z method(*args, **kwargs) 2025-12-04T12:52:45.8250722Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8250760Z with policy(): 2025-12-04T12:52:45.8250912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8250953Z raise RuntimeError(msg) 2025-12-04T12:52:45.8251339Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8251341Z 2025-12-04T12:52:45.8251417Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8251685Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8251687Z 2025-12-04T12:52:45.8251776Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8251778Z 2025-12-04T12:52:45.8251780Z 2025-12-04T12:52:45.8251855Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8251943Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8252175Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b62b5adc14a929a0.xml - 2025-12-04T12:52:45.8252236Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8252517Z FAILED [8.8136s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8252564Z Traceback (most recent call last): 2025-12-04T12:52:45.8252727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8252770Z getattr(self, test_name)() 2025-12-04T12:52:45.8252930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8252963Z fn() 2025-12-04T12:52:45.8253125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8253165Z method(*args, **kwargs) 2025-12-04T12:52:45.8253315Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8253357Z method(*args, **kwargs) 2025-12-04T12:52:45.8253506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8253542Z with policy(): 2025-12-04T12:52:45.8253695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8253734Z raise RuntimeError(msg) 2025-12-04T12:52:45.8254128Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8254131Z 2025-12-04T12:52:45.8254204Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8254483Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8254495Z 2025-12-04T12:52:45.8254583Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8254645Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
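Alongside the leak failure, each retry also prints the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. The fragment below is a minimal, single-process illustration of the init/teardown bracket that warning asks for; the "gloo" backend, address, and port are placeholder assumptions, not what the test harness actually uses.

# Illustrative init/teardown bracket; backend, address, and port are assumptions.
import os
import torch.distributed as dist

def main() -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)
    try:
        dist.barrier()  # stand-in for the real collective work
    finally:
        # Explicit teardown avoids the "destroy_process_group() was not called" warning.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()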
2025-12-04T12:52:45.8254708Z ======================= 1 failed, 9 deselected in 8.82s ======================== 2025-12-04T12:52:45.8254745Z Got exit code 1 2025-12-04T12:52:45.8254785Z Retrying single test... 2025-12-04T12:52:45.8254974Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-687e88c5a858511f.xml 2025-12-04T12:52:45.8255064Z ============================= test session starts ============================== 2025-12-04T12:52:45.8255220Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8255264Z cachedir: .pytest_cache 2025-12-04T12:52:45.8255422Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8255469Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8255509Z configfile: pytest.ini 2025-12-04T12:52:45.8255674Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8255746Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8256009Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8256052Z Running 1 items in this shard 2025-12-04T12:52:45.8256054Z 2025-12-04T12:52:45.8256394Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda I1204 12:50:51.842000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 503964 2025-12-04T12:52:45.8256550Z I1204 12:50:51.844000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 503965 2025-12-04T12:52:45.8256705Z I1204 12:50:51.844000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 503966 2025-12-04T12:52:45.8256857Z I1204 12:50:51.845000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 503967 2025-12-04T12:52:45.8257365Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8257431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8257926Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8257990Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8258533Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8258627Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8259109Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8259166Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8259311Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8259473Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8259764Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8259921Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8260206Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8260331Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8260608Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8260757Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8261032Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8261181Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8261468Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8261605Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8261882Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8262031Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8262570Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8262686Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8262881Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8263306Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8263420Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8263631Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8263795Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8263839Z dist init r=2, world=4 2025-12-04T12:52:45.8263978Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8264139Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8264425Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8264580Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8264866Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8264991Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8265267Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8265415Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8265691Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8265847Z [rank3]:E1204 12:50:58.900000 503967 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8266121Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8266260Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8266547Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8266695Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8267206Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8267342Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8267535Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8267930Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8268044Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8268301Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8268466Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8268504Z dist init r=3, world=4 2025-12-04T12:52:45.8268644Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8268801Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8269090Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8269244Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8269531Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8269657Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8269933Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8270092Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8270365Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8270516Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8270801Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8270937Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8271213Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8271372Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8271900Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8272013Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8272208Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8272603Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8272717Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8272928Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8273090Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8273128Z dist init r=0, world=4 2025-12-04T12:52:45.8273266Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8273426Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8273712Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8273866Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8274151Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8274285Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8274560Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8274708Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8274994Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8275140Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8275415Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8275551Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8275851Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8275999Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8276511Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8276625Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8276820Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8277215Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8277329Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8277539Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8277703Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8277742Z dist init r=1, world=4 2025-12-04T12:52:45.8278078Z [rank0]:[W1204 12:50:59.974819294 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8278117Z FAILED [8.9142s] [100%] 2025-12-04T12:52:45.8278119Z 2025-12-04T12:52:45.8278214Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8278347Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8278393Z Traceback (most recent call last): 2025-12-04T12:52:45.8278571Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8278618Z self._join_processes(fn) 2025-12-04T12:52:45.8278792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8278846Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8279024Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8279066Z raise RuntimeError(error) 2025-12-04T12:52:45.8279159Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8279204Z Traceback (most recent call last): 2025-12-04T12:52:45.8279366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8279409Z getattr(self, test_name)() 2025-12-04T12:52:45.8279568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8279614Z fn() 2025-12-04T12:52:45.8279765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8279819Z method(*args, **kwargs) 2025-12-04T12:52:45.8279972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8280011Z method(*args, **kwargs) 2025-12-04T12:52:45.8280162Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8280199Z with policy(): 2025-12-04T12:52:45.8280351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8280392Z raise RuntimeError(msg) 2025-12-04T12:52:45.8280781Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
2025-12-04T12:52:45.8280785Z 2025-12-04T12:52:45.8280862Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8281133Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8281135Z 2025-12-04T12:52:45.8281224Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8281226Z 2025-12-04T12:52:45.8281228Z 2025-12-04T12:52:45.8281303Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8281390Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8281620Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-687e88c5a858511f.xml - 2025-12-04T12:52:45.8281682Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8281961Z FAILED [8.9142s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8282008Z Traceback (most recent call last): 2025-12-04T12:52:45.8282171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8282214Z getattr(self, test_name)() 2025-12-04T12:52:45.8282385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8282419Z fn() 2025-12-04T12:52:45.8282570Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8282612Z method(*args, **kwargs) 2025-12-04T12:52:45.8282762Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8282801Z method(*args, **kwargs) 2025-12-04T12:52:45.8282951Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8283000Z with policy(): 2025-12-04T12:52:45.8283154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8283193Z raise RuntimeError(msg) 2025-12-04T12:52:45.8283580Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8283602Z 2025-12-04T12:52:45.8283676Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8283945Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8283946Z 2025-12-04T12:52:45.8284035Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8284097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8284159Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8284196Z Got exit code 1 2025-12-04T12:52:45.8284415Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8284542Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8284733Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-865ac1d538c948a6.xml 2025-12-04T12:52:45.8284789Z ============================= test session starts ============================== 2025-12-04T12:52:45.8284902Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8284942Z cachedir: .pytest_cache 2025-12-04T12:52:45.8285103Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8285148Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8285189Z configfile: pytest.ini 2025-12-04T12:52:45.8285350Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8285424Z collecting ... collected 10 items / 7 deselected / 3 selected 2025-12-04T12:52:45.8285477Z stepcurrent: skipping 7 already run items. 2025-12-04T12:52:45.8285523Z Running 3 items in this shard 2025-12-04T12:52:45.8285525Z 2025-12-04T12:52:45.8285866Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda I1204 12:51:03.413000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 504366 2025-12-04T12:52:45.8286020Z I1204 12:51:03.414000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 504367 2025-12-04T12:52:45.8286181Z I1204 12:51:03.415000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 504368 2025-12-04T12:52:45.8286330Z I1204 12:51:03.415000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 504369 2025-12-04T12:52:45.8286829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8286901Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8287392Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8287452Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8287946Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8288015Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8288535Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8288592Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8288736Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8288899Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8289189Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8289343Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8289628Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8289753Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8290032Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8290182Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8290459Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8290621Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8290896Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8291034Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8291327Z [rank2]:E1204 12:51:10.515000 504368 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8291475Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8291989Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8292130Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8292325Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8292726Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8292840Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8293052Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8293216Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8293255Z dist init r=2, world=4 2025-12-04T12:52:45.8293397Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8293555Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8293842Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8293997Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8294281Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8294405Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8294683Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8294842Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8295117Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8295265Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8295550Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8295685Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8295962Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8296109Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8296639Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8296753Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8296949Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8297346Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8297461Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8297672Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8297836Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8297876Z dist init r=3, world=4 2025-12-04T12:52:45.8298013Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8298208Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8298494Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8298648Z [rank0]:E1204 12:51:10.593000 504366 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8298933Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8299070Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8299347Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8299496Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8299771Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8299929Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8300204Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8300340Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8300644Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8300792Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8301301Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8301415Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8301610Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8302007Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8302120Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8302330Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8302493Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8302532Z dist init r=0, world=4 2025-12-04T12:52:45.8302671Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8302828Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8303114Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8303277Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8303559Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8303686Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8303960Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8304117Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8304393Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8304539Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8304824Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8304969Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8305246Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8305394Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8305902Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8306017Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8306213Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8306611Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8306723Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8306934Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8307099Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8307137Z dist init r=1, world=4 2025-12-04T12:52:45.8307474Z [rank0]:[W1204 12:51:10.436636023 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8307514Z FAILED [9.0133s] [ 33%] 2025-12-04T12:52:45.8307532Z 2025-12-04T12:52:45.8307588Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8307722Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8307769Z Traceback (most recent call last): 2025-12-04T12:52:45.8307931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8307975Z self._join_processes(fn) 2025-12-04T12:52:45.8308170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8308243Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8308420Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8308463Z raise RuntimeError(error) 2025-12-04T12:52:45.8308544Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8308590Z Traceback (most recent call last): 2025-12-04T12:52:45.8308763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8308819Z getattr(self, test_name)() 2025-12-04T12:52:45.8308977Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8309011Z fn() 2025-12-04T12:52:45.8309163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8309203Z method(*args, **kwargs) 2025-12-04T12:52:45.8309353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8309394Z method(*args, **kwargs) 2025-12-04T12:52:45.8309544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8309582Z with policy(): 2025-12-04T12:52:45.8309734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8309776Z raise RuntimeError(msg) 2025-12-04T12:52:45.8310159Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
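Each failing run also ends with the ProcessGroupNCCL warning quoted above (destroy_process_group() was not called before program exit). Independent of the leak itself, the shutdown that warning asks for looks roughly like the following; the init arguments are placeholders and this is not the test harness's actual setup/teardown, which lives in torch.testing._internal.common_distributed.

    import torch
    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        # Placeholder init: assumes MASTER_ADDR/MASTER_PORT are set in the environment.
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # collectives / FSDP forward-backward for this rank
        finally:
            # Explicit shutdown avoids the "destroy_process_group() was not called
            # before program exit" warning and releases communicator resources.
            dist.destroy_process_group()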
2025-12-04T12:52:45.8310163Z 2025-12-04T12:52:45.8310237Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8310506Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8310508Z 2025-12-04T12:52:45.8310596Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8310599Z 2025-12-04T12:52:45.8310601Z 2025-12-04T12:52:45.8310677Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8310764Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8310999Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-865ac1d538c948a6.xml - 2025-12-04T12:52:45.8311059Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8311351Z FAILED [9.0133s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8311398Z Traceback (most recent call last): 2025-12-04T12:52:45.8311560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8311604Z getattr(self, test_name)() 2025-12-04T12:52:45.8311764Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8311799Z fn() 2025-12-04T12:52:45.8311949Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8311989Z method(*args, **kwargs) 2025-12-04T12:52:45.8312148Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8312190Z method(*args, **kwargs) 2025-12-04T12:52:45.8312339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8312377Z with policy(): 2025-12-04T12:52:45.8312527Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8312588Z raise RuntimeError(msg) 2025-12-04T12:52:45.8312972Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8312975Z 2025-12-04T12:52:45.8313050Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8313318Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8313322Z 2025-12-04T12:52:45.8313409Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8313472Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
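The UserWarning repeated above for every rank ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") points at the test passing the bare string "cuda" as device_id. The two fixes the warning itself suggests look like this in isolation; the model is a placeholder and this is not the code of test_fsdp_comm.py.

    import torch
    import torch.nn as nn
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    rank = dist.get_rank()

    # Fix 1: make the current device explicit before wrapping, so "cuda" resolves to it.
    torch.cuda.set_device(rank)
    model = FSDP(nn.Linear(8, 8).cuda())

    # Fix 2: pass a device_id that carries an explicit index instead of the bare "cuda".
    model = FSDP(nn.Linear(8, 8), device_id=torch.device("cuda", rank))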
2025-12-04T12:52:45.8313534Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T12:52:45.8313572Z Got exit code 1 2025-12-04T12:52:45.8313611Z Retrying single test... 2025-12-04T12:52:45.8313798Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-681fc1044bc94934.xml 2025-12-04T12:52:45.8313855Z ============================= test session starts ============================== 2025-12-04T12:52:45.8313968Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8314007Z cachedir: .pytest_cache 2025-12-04T12:52:45.8314165Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8314210Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8314252Z configfile: pytest.ini 2025-12-04T12:52:45.8314413Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8314489Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8314751Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8314795Z Running 1 items in this shard 2025-12-04T12:52:45.8314797Z 2025-12-04T12:52:45.8315137Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda I1204 12:51:14.971000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 504768 2025-12-04T12:52:45.8315304Z I1204 12:51:14.973000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 504769 2025-12-04T12:52:45.8315456Z I1204 12:51:14.974000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 504770 2025-12-04T12:52:45.8315609Z I1204 12:51:14.974000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 504771 2025-12-04T12:52:45.8316118Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8316181Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8316668Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8316752Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8317236Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8317294Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8317777Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8317836Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8317979Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8318143Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8318467Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8318621Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8318906Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8319031Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8319308Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8319456Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8319744Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8319893Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8320166Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8320319Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8320597Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8320746Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8321267Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2466250752 and is now 3196059648. 2025-12-04T12:52:45.8321402Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8321597Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8321992Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8322110Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8322322Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8322488Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8322526Z dist init r=0, world=4 2025-12-04T12:52:45.8322667Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8322828Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8323115Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8323270Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8323554Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8323679Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8323963Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8324110Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8324385Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8324541Z [rank3]:E1204 12:51:22.189000 504771 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8324817Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8324954Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8325232Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8325400Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8325910Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8326026Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8326220Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8326615Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8326730Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8326940Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8327103Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8327144Z dist init r=3, world=4 2025-12-04T12:52:45.8327283Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8327445Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8327732Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8327885Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8328363Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8328489Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8328766Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8328913Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8329201Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8329348Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8329622Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8329788Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8330065Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8330213Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8330723Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8330838Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8331033Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8331425Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8331539Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8331748Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8331913Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8331951Z dist init r=1, world=4 2025-12-04T12:52:45.8332091Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8332250Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8332545Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8332700Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8332984Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8333107Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8333390Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8333538Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8333814Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8333979Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8334254Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8334390Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8334670Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8334817Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8335327Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8335441Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8335636Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8336029Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8336144Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8336353Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8336517Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8336556Z dist init r=2, world=4 2025-12-04T12:52:45.8336902Z [rank0]:[W1204 12:51:22.006850826 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8336943Z FAILED [9.0130s] [100%] 2025-12-04T12:52:45.8336946Z 2025-12-04T12:52:45.8337002Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8337135Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8337183Z Traceback (most recent call last): 2025-12-04T12:52:45.8337354Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8337399Z self._join_processes(fn) 2025-12-04T12:52:45.8337572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8337627Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8337804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8337863Z raise RuntimeError(error) 2025-12-04T12:52:45.8337954Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8338001Z Traceback (most recent call last): 2025-12-04T12:52:45.8338211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8338254Z getattr(self, test_name)() 2025-12-04T12:52:45.8338413Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8338447Z fn() 2025-12-04T12:52:45.8338598Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8338639Z method(*args, **kwargs) 2025-12-04T12:52:45.8338788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8338829Z method(*args, **kwargs) 2025-12-04T12:52:45.8338978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8339016Z with policy(): 2025-12-04T12:52:45.8339169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8339210Z raise RuntimeError(msg) 2025-12-04T12:52:45.8339594Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2466250752 and is now 3196059648. 
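All of the consistently failing parametrisations here carry use_no_sync_True, i.e. they exercise FSDP's no_sync() gradient-accumulation path. For orientation, the pattern under test looks roughly like this; it is a sketch of the public API, not the body of test_fsdp_comm.py, and the model and batches are placeholders.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def accumulate_then_sync(model: FSDP, batches: list[torch.Tensor]) -> None:
        # Inside no_sync() gradients stay local: FSDP skips the gradient
        # communication it would normally issue during backward.
        with model.no_sync():
            for x in batches[:-1]:
                model(x).sum().backward()
        # The first backward outside the context performs the deferred communication.
        model(batches[-1]).sum().backward()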
2025-12-04T12:52:45.8339598Z 2025-12-04T12:52:45.8339675Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8339944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8339947Z 2025-12-04T12:52:45.8340036Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8340038Z 2025-12-04T12:52:45.8340040Z 2025-12-04T12:52:45.8340115Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8340202Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8340435Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-681fc1044bc94934.xml - 2025-12-04T12:52:45.8340494Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8340788Z FAILED [9.0130s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8340836Z Traceback (most recent call last): 2025-12-04T12:52:45.8341001Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8341043Z getattr(self, test_name)() 2025-12-04T12:52:45.8341201Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8341247Z fn() 2025-12-04T12:52:45.8341399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8341440Z method(*args, **kwargs) 2025-12-04T12:52:45.8341592Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8341633Z method(*args, **kwargs) 2025-12-04T12:52:45.8341782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8341845Z with policy(): 2025-12-04T12:52:45.8341995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8342035Z raise RuntimeError(msg) 2025-12-04T12:52:45.8342423Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2466250752 and is now 3196059648. 2025-12-04T12:52:45.8342425Z 2025-12-04T12:52:45.8342500Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8342766Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8342770Z 2025-12-04T12:52:45.8342857Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8342922Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
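Each session header above also reports hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]. A profile with exactly those settings would be registered and loaded roughly as follows; this is the generic hypothesis API, not a quote of PyTorch's conftest.

    from hypothesis import HealthCheck, settings

    settings.register_profile(
        "pytorch_ci",
        database=None,               # no example database on CI
        max_examples=50,
        derandomize=True,            # deterministic example generation
        suppress_health_check=[HealthCheck.too_slow],
    )
    settings.load_profile("pytorch_ci")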
2025-12-04T12:52:45.8342983Z ======================= 1 failed, 9 deselected in 9.02s ======================== 2025-12-04T12:52:45.8343020Z Got exit code 1 2025-12-04T12:52:45.8343059Z Retrying single test... 2025-12-04T12:52:45.8343248Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-9dd64414167d3313.xml 2025-12-04T12:52:45.8343306Z ============================= test session starts ============================== 2025-12-04T12:52:45.8343419Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8343459Z cachedir: .pytest_cache 2025-12-04T12:52:45.8343618Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8343665Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8343707Z configfile: pytest.ini 2025-12-04T12:52:45.8343869Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8343943Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8344205Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8344249Z Running 1 items in this shard 2025-12-04T12:52:45.8344252Z 2025-12-04T12:52:45.8344598Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda I1204 12:51:26.607000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 505170 2025-12-04T12:52:45.8344756Z I1204 12:51:26.608000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 505171 2025-12-04T12:52:45.8344910Z I1204 12:51:26.609000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 505172 2025-12-04T12:52:45.8345060Z I1204 12:51:26.609000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 505173 2025-12-04T12:52:45.8345566Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8345628Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8346128Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8346198Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8346682Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8346740Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8347221Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8347279Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8347423Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8347585Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8347876Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8348031Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8348365Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8348491Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8348783Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8348931Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8349207Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8349356Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8349641Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8349779Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8350056Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8352932Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8353463Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8353582Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8353782Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8354185Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8354303Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8354515Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8354681Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8354722Z dist init r=3, world=4 2025-12-04T12:52:45.8354864Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8355023Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8355313Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8355468Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8355781Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8355907Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8356184Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8356333Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8356619Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8356766Z [rank1]:E1204 12:51:33.573000 505171 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8357041Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8357188Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8357485Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8357633Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8358198Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8358315Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8358511Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8358907Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8359020Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8359231Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8359394Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8359435Z dist init r=1, world=4 2025-12-04T12:52:45.8359572Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8359732Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8360018Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8360187Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8360474Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8360600Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8360888Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8361035Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8361311Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8361469Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8361757Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8361894Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8362168Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8362317Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8362827Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 
2025-12-04T12:52:45.8362944Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8363139Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8363534Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8363650Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8363860Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8364024Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8364062Z dist init r=2, world=4 2025-12-04T12:52:45.8364199Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8364366Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8364651Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8364806Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8365099Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8365224Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8365500Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8365647Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8365941Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8366088Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8366363Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8366499Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8366774Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8366923Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8367434Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8367549Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8367744Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8368139Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8368294Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8368503Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8368680Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8368719Z dist init r=0, world=4 2025-12-04T12:52:45.8369056Z [rank0]:[W1204 12:51:33.583180533 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8369097Z FAILED [9.0136s] [100%] 2025-12-04T12:52:45.8369100Z 2025-12-04T12:52:45.8369157Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8369306Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8369354Z Traceback (most recent call last): 2025-12-04T12:52:45.8369517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8369562Z self._join_processes(fn) 2025-12-04T12:52:45.8369734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8369803Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8369999Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8370043Z raise RuntimeError(error) 2025-12-04T12:52:45.8370125Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8370170Z Traceback (most recent call last): 2025-12-04T12:52:45.8370331Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8370374Z getattr(self, test_name)() 2025-12-04T12:52:45.8370532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8370568Z fn() 2025-12-04T12:52:45.8370719Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8370762Z method(*args, **kwargs) 2025-12-04T12:52:45.8370913Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8370954Z method(*args, **kwargs) 2025-12-04T12:52:45.8371103Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8371140Z with policy(): 2025-12-04T12:52:45.8371292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8371333Z raise RuntimeError(msg) 2025-12-04T12:52:45.8371721Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8371726Z 2025-12-04T12:52:45.8371801Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8372071Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8372073Z 2025-12-04T12:52:45.8372162Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8372165Z 2025-12-04T12:52:45.8372226Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8372270Z Traceback (most recent call last): 2025-12-04T12:52:45.8372433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8372484Z getattr(self, test_name)() 2025-12-04T12:52:45.8372644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8372679Z fn() 2025-12-04T12:52:45.8372832Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8372871Z method(*args, **kwargs) 2025-12-04T12:52:45.8373021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8373060Z method(*args, **kwargs) 2025-12-04T12:52:45.8373220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8373257Z with policy(): 2025-12-04T12:52:45.8373409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8373451Z raise RuntimeError(msg) 2025-12-04T12:52:45.8373834Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8373866Z 2025-12-04T12:52:45.8373940Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8374208Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8374209Z 2025-12-04T12:52:45.8374298Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8374300Z 2025-12-04T12:52:45.8374358Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8374405Z Traceback (most recent call last): 2025-12-04T12:52:45.8374567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8374610Z getattr(self, test_name)() 2025-12-04T12:52:45.8374769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8374804Z fn() 2025-12-04T12:52:45.8374954Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8374993Z method(*args, **kwargs) 2025-12-04T12:52:45.8375145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8375184Z method(*args, **kwargs) 2025-12-04T12:52:45.8375333Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8375370Z with policy(): 2025-12-04T12:52:45.8375521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8375563Z raise RuntimeError(msg) 2025-12-04T12:52:45.8375946Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8375949Z 2025-12-04T12:52:45.8376021Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8376287Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8376289Z 2025-12-04T12:52:45.8376384Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8376387Z 2025-12-04T12:52:45.8376389Z 2025-12-04T12:52:45.8376467Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8376555Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8376789Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-9dd64414167d3313.xml - 2025-12-04T12:52:45.8376850Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8377144Z FAILED [9.0136s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8377191Z Traceback (most recent call last): 2025-12-04T12:52:45.8377354Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8377396Z getattr(self, test_name)() 2025-12-04T12:52:45.8377565Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8377609Z fn() 2025-12-04T12:52:45.8377759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8377799Z method(*args, **kwargs) 2025-12-04T12:52:45.8377948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8377988Z method(*args, **kwargs) 2025-12-04T12:52:45.8378136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8378216Z with policy(): 2025-12-04T12:52:45.8378366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8378407Z raise RuntimeError(msg) 2025-12-04T12:52:45.8378790Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8378794Z 2025-12-04T12:52:45.8378867Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8379137Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8379139Z 2025-12-04T12:52:45.8379226Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8379228Z 2025-12-04T12:52:45.8379287Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8379331Z Traceback (most recent call last): 2025-12-04T12:52:45.8379495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8379537Z getattr(self, test_name)() 2025-12-04T12:52:45.8379696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8379729Z fn() 2025-12-04T12:52:45.8379881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8379921Z method(*args, **kwargs) 2025-12-04T12:52:45.8380070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8380126Z method(*args, **kwargs) 2025-12-04T12:52:45.8380275Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8380313Z with policy(): 2025-12-04T12:52:45.8380464Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8380506Z raise RuntimeError(msg) 2025-12-04T12:52:45.8380902Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8380904Z 2025-12-04T12:52:45.8380977Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8381243Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8381244Z 2025-12-04T12:52:45.8381331Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8381347Z 2025-12-04T12:52:45.8381417Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8381462Z Traceback (most recent call last): 2025-12-04T12:52:45.8381624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8381666Z getattr(self, test_name)() 2025-12-04T12:52:45.8381825Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8381858Z fn() 2025-12-04T12:52:45.8382007Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8382046Z method(*args, **kwargs) 2025-12-04T12:52:45.8382195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8382234Z method(*args, **kwargs) 2025-12-04T12:52:45.8382383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8382419Z with policy(): 2025-12-04T12:52:45.8382569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8382609Z raise RuntimeError(msg) 2025-12-04T12:52:45.8382991Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8382993Z 2025-12-04T12:52:45.8383065Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8383330Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8383333Z 2025-12-04T12:52:45.8383422Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8383485Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8383548Z ======================= 1 failed, 9 deselected in 9.02s ======================== 2025-12-04T12:52:45.8383585Z Got exit code 1 2025-12-04T12:52:45.8383806Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8383944Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8384135Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-4c47fa7ebdbb029a.xml 2025-12-04T12:52:45.8384194Z ============================= test session starts ============================== 2025-12-04T12:52:45.8384308Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8384349Z cachedir: .pytest_cache 2025-12-04T12:52:45.8384506Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8384553Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8384603Z configfile: pytest.ini 2025-12-04T12:52:45.8384765Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8384839Z collecting ... collected 10 items / 8 deselected / 2 selected 2025-12-04T12:52:45.8384892Z stepcurrent: skipping 8 already run items. 2025-12-04T12:52:45.8384936Z Running 2 items in this shard 2025-12-04T12:52:45.8384938Z 2025-12-04T12:52:45.8385253Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda I1204 12:51:37.919000 505503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 505572 2025-12-04T12:52:45.8385418Z I1204 12:51:37.920000 505503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 505573 2025-12-04T12:52:45.8385917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8385980Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8386469Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8386531Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8387612Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. 
If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8387739Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8388861Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8388986Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8389129Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8389314Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8389605Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8389759Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8391710Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8391851Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8392129Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8392278Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8392555Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8392704Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8392979Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8393116Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8393394Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8393541Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8394020Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8394136Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8394332Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8394699Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8394816Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8395030Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8395203Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8395244Z dist init r=1, world=2 2025-12-04T12:52:45.8395382Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8395542Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8395827Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8396003Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8396287Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8396412Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8396687Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8396833Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8397109Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8397257Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8397533Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8397669Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8397945Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8398095Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8398621Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8398757Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8398952Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8399309Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8399423Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8399647Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8399814Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8399853Z dist init r=0, world=2 2025-12-04T12:52:45.8400188Z [rank0]:[W1204 12:51:44.562634456 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8400251Z FAILED [8.6135s] [ 50%] 2025-12-04T12:52:45.8400254Z 2025-12-04T12:52:45.8400309Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8400408Z ____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda _____ 2025-12-04T12:52:45.8400455Z Traceback (most recent call last): 2025-12-04T12:52:45.8400618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8400661Z self._join_processes(fn) 2025-12-04T12:52:45.8400833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8400888Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8401064Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8401109Z raise RuntimeError(error) 2025-12-04T12:52:45.8401189Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8401234Z Traceback (most recent call last): 2025-12-04T12:52:45.8401395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8401436Z getattr(self, test_name)() 2025-12-04T12:52:45.8401593Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8401627Z fn() 2025-12-04T12:52:45.8401780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8401821Z method(*args, **kwargs) 2025-12-04T12:52:45.8401973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8402013Z method(*args, **kwargs) 2025-12-04T12:52:45.8402164Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8402200Z with policy(): 2025-12-04T12:52:45.8402352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8402392Z raise RuntimeError(msg) 2025-12-04T12:52:45.8402751Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8402754Z 2025-12-04T12:52:45.8402828Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8403059Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8403062Z 2025-12-04T12:52:45.8403150Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8403152Z 2025-12-04T12:52:45.8403210Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8403267Z Traceback (most recent call last): 2025-12-04T12:52:45.8403429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8403472Z getattr(self, test_name)() 2025-12-04T12:52:45.8403630Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8403665Z fn() 2025-12-04T12:52:45.8403814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8403879Z method(*args, **kwargs) 2025-12-04T12:52:45.8404028Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8404068Z method(*args, **kwargs) 2025-12-04T12:52:45.8404218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8404256Z with policy(): 2025-12-04T12:52:45.8404406Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8404447Z raise RuntimeError(msg) 2025-12-04T12:52:45.8404792Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8404796Z 2025-12-04T12:52:45.8404869Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8405098Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8405100Z 2025-12-04T12:52:45.8405187Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8405189Z 2025-12-04T12:52:45.8405191Z 2025-12-04T12:52:45.8405266Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8405352Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8405587Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-4c47fa7ebdbb029a.xml - 2025-12-04T12:52:45.8405647Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8405892Z FAILED [8.6135s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8405939Z Traceback (most recent call last): 2025-12-04T12:52:45.8406102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8406144Z getattr(self, test_name)() 2025-12-04T12:52:45.8406303Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8406336Z fn() 2025-12-04T12:52:45.8406506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8406547Z method(*args, **kwargs) 2025-12-04T12:52:45.8406698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8406738Z method(*args, **kwargs) 2025-12-04T12:52:45.8406886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8406926Z with policy(): 2025-12-04T12:52:45.8407086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8407128Z raise RuntimeError(msg) 2025-12-04T12:52:45.8407476Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8407478Z 2025-12-04T12:52:45.8407551Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8407789Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8407802Z 2025-12-04T12:52:45.8407890Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8407893Z 2025-12-04T12:52:45.8407950Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8407999Z Traceback (most recent call last): 2025-12-04T12:52:45.8408187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8408231Z getattr(self, test_name)() 2025-12-04T12:52:45.8408391Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8408425Z fn() 2025-12-04T12:52:45.8408575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8408616Z method(*args, **kwargs) 2025-12-04T12:52:45.8408765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8408803Z method(*args, **kwargs) 2025-12-04T12:52:45.8408952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8408989Z with policy(): 2025-12-04T12:52:45.8409140Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8409180Z raise RuntimeError(msg) 2025-12-04T12:52:45.8409526Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8409530Z 2025-12-04T12:52:45.8409602Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8409829Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8409832Z 2025-12-04T12:52:45.8409917Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8409982Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8410044Z ======================= 1 failed, 8 deselected in 8.62s ======================== 2025-12-04T12:52:45.8410081Z Got exit code 1 2025-12-04T12:52:45.8410121Z Retrying single test... 
2025-12-04T12:52:45.8410326Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a73d7424feda7a29.xml 2025-12-04T12:52:45.8410386Z ============================= test session starts ============================== 2025-12-04T12:52:45.8410499Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8410540Z cachedir: .pytest_cache 2025-12-04T12:52:45.8410696Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8410742Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8410797Z configfile: pytest.ini 2025-12-04T12:52:45.8410961Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8411033Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8411254Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8411313Z Running 1 items in this shard 2025-12-04T12:52:45.8411315Z 2025-12-04T12:52:45.8411631Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda I1204 12:51:48.849000 505739 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 505808 2025-12-04T12:52:45.8411784Z I1204 12:51:48.850000 505739 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 505809 2025-12-04T12:52:45.8412281Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8412343Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8412829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8412890Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8413969Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8414094Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8415160Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8415284Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8415436Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8415598Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8415889Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8416056Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8416353Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8416479Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8416756Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8416904Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8417179Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8417327Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8417602Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8417737Z [rank0]:E1204 12:51:55.607000 505808 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8418014Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8418220Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8418700Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8418814Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8419035Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8419392Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8419508Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8419719Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8419894Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8419934Z dist init r=0, world=2 2025-12-04T12:52:45.8420072Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8420230Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8420530Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8420707Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8420992Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8421117Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8421392Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8421540Z [rank1]:E1204 12:51:55.610000 505809 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8421814Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8421962Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8422235Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8422371Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8422646Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8422795Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8423285Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8423399Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8423595Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8423951Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8424078Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8424289Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8424454Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8424493Z dist init r=1, world=2 2025-12-04T12:52:45.8424840Z [rank0]:[W1204 12:51:55.448252263 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8424892Z FAILED [8.5120s] [100%] 2025-12-04T12:52:45.8424894Z 2025-12-04T12:52:45.8424949Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8425050Z ____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda _____ 2025-12-04T12:52:45.8425096Z Traceback (most recent call last): 2025-12-04T12:52:45.8425260Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8425303Z self._join_processes(fn) 2025-12-04T12:52:45.8425477Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8425530Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8425709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8425752Z raise RuntimeError(error) 2025-12-04T12:52:45.8425832Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8425877Z Traceback (most recent call last): 2025-12-04T12:52:45.8426039Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8426080Z getattr(self, test_name)() 2025-12-04T12:52:45.8426239Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8426274Z fn() 2025-12-04T12:52:45.8426424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8426465Z method(*args, **kwargs) 2025-12-04T12:52:45.8426615Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8426656Z method(*args, **kwargs) 2025-12-04T12:52:45.8426806Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8426843Z with policy(): 2025-12-04T12:52:45.8426994Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8427035Z raise RuntimeError(msg) 2025-12-04T12:52:45.8427396Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
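Both ranks also emit the torch/distributed/fsdp/_init_utils.py UserWarning shown above because the test passes a bare "cuda" device as device_id. A minimal sketch of the remedy the warning itself suggests (an explicit device index, or torch.cuda.set_device before FSDP construction); the model and rank handling here are placeholders and assume one process per GPU on a single node:

    # Sketch of the fix suggested by the FSDP UserWarning above (placeholders, not the test code).
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap(model: torch.nn.Module) -> FSDP:
        rank = dist.get_rank()                      # assumes single node, one process per GPU
        torch.cuda.set_device(rank)                 # pin the current device for this rank
        return FSDP(model, device_id=torch.device("cuda", rank))  # explicit index, no warning

Either half of this (set_device or an indexed device_id) is enough to silence the warning; the warning itself is separate from the leak-check failure that actually fails the test.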
2025-12-04T12:52:45.8427399Z 2025-12-04T12:52:45.8427474Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8427704Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8427706Z 2025-12-04T12:52:45.8427794Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8427796Z 2025-12-04T12:52:45.8427808Z 2025-12-04T12:52:45.8427884Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8427970Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8428238Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a73d7424feda7a29.xml - 2025-12-04T12:52:45.8428298Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8428558Z FAILED [8.5120s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8428619Z Traceback (most recent call last): 2025-12-04T12:52:45.8428782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8428824Z getattr(self, test_name)() 2025-12-04T12:52:45.8428983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8429017Z fn() 2025-12-04T12:52:45.8429169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8429208Z method(*args, **kwargs) 2025-12-04T12:52:45.8429360Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8429400Z method(*args, **kwargs) 2025-12-04T12:52:45.8429550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8429586Z with policy(): 2025-12-04T12:52:45.8429738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8429778Z raise RuntimeError(msg) 2025-12-04T12:52:45.8430130Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8430133Z 2025-12-04T12:52:45.8430206Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8430433Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8430436Z 2025-12-04T12:52:45.8430523Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8430585Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8430647Z ======================= 1 failed, 9 deselected in 8.52s ======================== 2025-12-04T12:52:45.8430685Z Got exit code 1 2025-12-04T12:52:45.8430725Z Retrying single test... 2025-12-04T12:52:45.8430913Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-8df6fab4d75749e9.xml 2025-12-04T12:52:45.8430984Z ============================= test session starts ============================== 2025-12-04T12:52:45.8431095Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8431137Z cachedir: .pytest_cache 2025-12-04T12:52:45.8431294Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8431341Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8431380Z configfile: pytest.ini 2025-12-04T12:52:45.8431542Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8431629Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8431850Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8431894Z Running 1 items in this shard 2025-12-04T12:52:45.8431897Z 2025-12-04T12:52:45.8432200Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda I1204 12:51:59.754000 505975 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506044 2025-12-04T12:52:45.8432377Z I1204 12:51:59.755000 505975 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506045 2025-12-04T12:52:45.8432871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8432933Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8433419Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8433483Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8434557Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8434682Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8435750Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8435874Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8436017Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8436182Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8436484Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8436638Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8436925Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8437078Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8437357Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8437504Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8437780Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8437927Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8438238Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8438374Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8438652Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8438801Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8439273Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8439390Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8439586Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8439964Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8440078Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8440290Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8440454Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8440493Z dist init r=0, world=2 2025-12-04T12:52:45.8440645Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8440804Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8441093Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8441273Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8441555Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8441682Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8441957Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8442104Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8442380Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8442527Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8442802Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8442937Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8443216Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8443366Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8443840Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8443953Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8444162Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8444517Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8444631Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8444851Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8445015Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8445054Z dist init r=1, world=2 2025-12-04T12:52:45.8445391Z [rank0]:[W1204 12:52:07.971638389 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8445449Z FAILED [9.4118s] [100%] 2025-12-04T12:52:45.8445451Z 2025-12-04T12:52:45.8445507Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8445607Z ____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda _____ 2025-12-04T12:52:45.8445653Z Traceback (most recent call last): 2025-12-04T12:52:45.8445815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8445859Z self._join_processes(fn) 2025-12-04T12:52:45.8446031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8446087Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8446263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8446308Z raise RuntimeError(error) 2025-12-04T12:52:45.8446387Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8446432Z Traceback (most recent call last): 2025-12-04T12:52:45.8446591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8446634Z getattr(self, test_name)() 2025-12-04T12:52:45.8446790Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8446825Z fn() 2025-12-04T12:52:45.8446975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8447016Z method(*args, **kwargs) 2025-12-04T12:52:45.8447165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8447206Z method(*args, **kwargs) 2025-12-04T12:52:45.8447356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8447393Z with policy(): 2025-12-04T12:52:45.8447544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8447584Z raise RuntimeError(msg) 2025-12-04T12:52:45.8447932Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
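The c10d::allreduce_ UserWarning repeated above suggests registering a fallthrough for the Autograd dispatch key (torch::CppFunction::makeFallthrough() in C++). A rough Python-side analogue for a custom operator uses torch.library's fallthrough kernel; the operator and namespace below are hypothetical, and whether this mechanism is the right fix for c10d::allreduce_ itself is an assumption — it is shown only to illustrate what the warning is asking for:

    # Hypothetical example: define a non-differentiable in-place op and register a
    # fallthrough for the Autograd key so backprop through it does not warn.
    import torch

    lib = torch.library.Library("mylib", "FRAGMENT")                 # hypothetical namespace
    lib.define("noop_(Tensor(a!) x) -> Tensor(a!)")
    lib.impl("noop_", lambda x: x, "CompositeExplicitAutograd")      # actual kernel
    lib.impl("noop_", torch.library.fallthrough_kernel, "Autograd")  # squash the warning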
2025-12-04T12:52:45.8447946Z 2025-12-04T12:52:45.8448021Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8448294Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8448298Z 2025-12-04T12:52:45.8448386Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8448388Z 2025-12-04T12:52:45.8448390Z 2025-12-04T12:52:45.8448464Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8448565Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8448799Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-8df6fab4d75749e9.xml - 2025-12-04T12:52:45.8448859Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8449104Z FAILED [9.4118s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8449163Z Traceback (most recent call last): 2025-12-04T12:52:45.8449339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8449381Z getattr(self, test_name)() 2025-12-04T12:52:45.8449541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8449576Z fn() 2025-12-04T12:52:45.8449727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8449767Z method(*args, **kwargs) 2025-12-04T12:52:45.8449918Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8449958Z method(*args, **kwargs) 2025-12-04T12:52:45.8450107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8450145Z with policy(): 2025-12-04T12:52:45.8450295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8450336Z raise RuntimeError(msg) 2025-12-04T12:52:45.8450683Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8450685Z 2025-12-04T12:52:45.8450759Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8450987Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8450990Z 2025-12-04T12:52:45.8451077Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8451141Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8451201Z ======================= 1 failed, 9 deselected in 9.42s ======================== 2025-12-04T12:52:45.8451238Z Got exit code 1 2025-12-04T12:52:45.8451414Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8451542Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8451730Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-7ec29cbcb4eb0a53.xml 2025-12-04T12:52:45.8451801Z ============================= test session starts ============================== 2025-12-04T12:52:45.8451912Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8451955Z cachedir: .pytest_cache 2025-12-04T12:52:45.8452112Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8452157Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8452197Z configfile: pytest.ini 2025-12-04T12:52:45.8452376Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8452449Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8452502Z stepcurrent: skipping 9 already run items. 2025-12-04T12:52:45.8452545Z Running 1 items in this shard 2025-12-04T12:52:45.8452547Z 2025-12-04T12:52:45.8452852Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda I1204 12:52:11.471000 506211 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506280 2025-12-04T12:52:45.8453019Z I1204 12:52:11.472000 506211 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506281 2025-12-04T12:52:45.8453525Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8453587Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8454075Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8454137Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8455207Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. 
DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8455331Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8456403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8456528Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8456672Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8456834Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8457133Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8457288Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8457573Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8457721Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8457997Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8458197Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8458473Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8458619Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8458899Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8459036Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8459313Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8459462Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8459933Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8460049Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8460245Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8460620Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8460734Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8460946Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8461112Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8461151Z dist init r=0, world=2 2025-12-04T12:52:45.8461301Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8461459Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8461746Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8461899Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8462207Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8462331Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8462608Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8462755Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8463029Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8463177Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8463454Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8463590Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8463867Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8464014Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8464485Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8464598Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8464804Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8465156Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8465271Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8465481Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8465655Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8465694Z dist init r=1, world=2 2025-12-04T12:52:45.8466033Z [rank0]:[W1204 12:52:18.165898650 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8466083Z FAILED [8.6119s] [100%] 2025-12-04T12:52:45.8466085Z 2025-12-04T12:52:45.8466149Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8466248Z _____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda _____ 2025-12-04T12:52:45.8466295Z Traceback (most recent call last): 2025-12-04T12:52:45.8466456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8466502Z self._join_processes(fn) 2025-12-04T12:52:45.8466675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8466727Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8466905Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8466948Z raise RuntimeError(error) 2025-12-04T12:52:45.8467028Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8467074Z Traceback (most recent call last): 2025-12-04T12:52:45.8467236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8467280Z getattr(self, test_name)() 2025-12-04T12:52:45.8467437Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8467472Z fn() 2025-12-04T12:52:45.8467623Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8467663Z method(*args, **kwargs) 2025-12-04T12:52:45.8467814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8467854Z method(*args, **kwargs) 2025-12-04T12:52:45.8468003Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8468041Z with policy(): 2025-12-04T12:52:45.8468235Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8468276Z raise RuntimeError(msg) 2025-12-04T12:52:45.8468620Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 
2025-12-04T12:52:45.8468623Z 2025-12-04T12:52:45.8468726Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8468953Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8468958Z 2025-12-04T12:52:45.8469046Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8469049Z 2025-12-04T12:52:45.8469050Z 2025-12-04T12:52:45.8469125Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8469211Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8469466Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-7ec29cbcb4eb0a53.xml - 2025-12-04T12:52:45.8469526Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8469768Z FAILED [8.6119s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8469813Z Traceback (most recent call last): 2025-12-04T12:52:45.8469992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8470046Z getattr(self, test_name)() 2025-12-04T12:52:45.8470205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8470239Z fn() 2025-12-04T12:52:45.8470392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8470431Z method(*args, **kwargs) 2025-12-04T12:52:45.8470583Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8470622Z method(*args, **kwargs) 2025-12-04T12:52:45.8470772Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8470809Z with policy(): 2025-12-04T12:52:45.8470960Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8471002Z raise RuntimeError(msg) 2025-12-04T12:52:45.8471349Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8471351Z 2025-12-04T12:52:45.8471425Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8471652Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8471654Z 2025-12-04T12:52:45.8471741Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8471804Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
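[editor's note] The failure above is raised by the memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it snapshots the caching-allocator bytes and the driver-level memory usage before the test and compares them afterwards (here the allocator goes from 512 B to 9216 B and the driver-allocated memory from roughly 2.0 GB to 3.5 GB on device 0). The sketch below is only a rough approximation of that before/after accounting using public torch.cuda APIs; it is not the harness code from common_utils.py, and the function and variable names are placeholders.

    # Hypothetical sketch only -- approximates the kind of before/after
    # accounting the CUDA mem-leak check performs around a test body.
    import torch

    def assert_no_cuda_leak(test_fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)    # caching allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)  # driver-level view
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver-used "
                f"{total - free_before} -> {total - free_after} bytes"
            )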
2025-12-04T12:52:45.8471867Z ======================= 1 failed, 9 deselected in 8.62s ======================== 2025-12-04T12:52:45.8471903Z Got exit code 1 2025-12-04T12:52:45.8471944Z Retrying single test... 2025-12-04T12:52:45.8472131Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e214515fa2a46151.xml 2025-12-04T12:52:45.8472190Z ============================= test session starts ============================== 2025-12-04T12:52:45.8472300Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8472341Z cachedir: .pytest_cache 2025-12-04T12:52:45.8472514Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8472565Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8472606Z configfile: pytest.ini 2025-12-04T12:52:45.8472768Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8472841Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8473059Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8473103Z Running 1 items in this shard 2025-12-04T12:52:45.8473114Z 2025-12-04T12:52:45.8473414Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda I1204 12:52:22.769000 506447 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506516 2025-12-04T12:52:45.8473569Z I1204 12:52:22.770000 506447 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506517 2025-12-04T12:52:45.8474062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8474146Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8474635Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8474694Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8475769Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8475896Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8476958Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8477092Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8477236Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8477398Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8477690Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8477853Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8478140Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8478293Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8478585Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8478745Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8479021Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8479168Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8479441Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8479579Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8479859Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8480009Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8480484Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8480598Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8480794Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8481147Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8481260Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8481483Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8481648Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8481688Z dist init r=0, world=2 2025-12-04T12:52:45.8481826Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8481986Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8482286Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8482440Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8482723Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8482872Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8483147Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8483294Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8483568Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8483715Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8483990Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8484124Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8484401Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8484550Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8485021Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8485137Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8485331Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8485696Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8485809Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8486022Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8486185Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8486224Z dist init r=1, world=2 2025-12-04T12:52:45.8486567Z [rank0]:[W1204 12:52:29.648773767 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8486608Z FAILED [8.8113s] [100%] 2025-12-04T12:52:45.8486611Z 2025-12-04T12:52:45.8486667Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8486765Z _____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda _____ 2025-12-04T12:52:45.8486830Z Traceback (most recent call last): 2025-12-04T12:52:45.8486992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8487035Z self._join_processes(fn) 2025-12-04T12:52:45.8487207Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8487261Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8487437Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8487482Z raise RuntimeError(error) 2025-12-04T12:52:45.8487561Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8487606Z Traceback (most recent call last): 2025-12-04T12:52:45.8487766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8487811Z getattr(self, test_name)() 2025-12-04T12:52:45.8487967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8488002Z fn() 2025-12-04T12:52:45.8488195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8488236Z method(*args, **kwargs) 2025-12-04T12:52:45.8488387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8488428Z method(*args, **kwargs) 2025-12-04T12:52:45.8488578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8488615Z with policy(): 2025-12-04T12:52:45.8488766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8488808Z raise RuntimeError(msg) 2025-12-04T12:52:45.8489155Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 
2025-12-04T12:52:45.8489158Z 2025-12-04T12:52:45.8489233Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8489460Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8489487Z 2025-12-04T12:52:45.8489574Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8489577Z 2025-12-04T12:52:45.8489579Z 2025-12-04T12:52:45.8489654Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8489742Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8489973Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e214515fa2a46151.xml - 2025-12-04T12:52:45.8490032Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8490288Z FAILED [8.8113s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8490335Z Traceback (most recent call last): 2025-12-04T12:52:45.8490498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8490541Z getattr(self, test_name)() 2025-12-04T12:52:45.8490711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8490763Z fn() 2025-12-04T12:52:45.8490914Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8490956Z method(*args, **kwargs) 2025-12-04T12:52:45.8491106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8491146Z method(*args, **kwargs) 2025-12-04T12:52:45.8491297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8491336Z with policy(): 2025-12-04T12:52:45.8491489Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8491530Z raise RuntimeError(msg) 2025-12-04T12:52:45.8491877Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8491880Z 2025-12-04T12:52:45.8491956Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8492183Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8492187Z 2025-12-04T12:52:45.8492272Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8492336Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
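[editor's note] Each retry also prints the FSDP UserWarning about `device_id` being the bare, index-less "cuda" device. As the warning itself suggests, the usual fix is either to call torch.cuda.set_device() with the local rank before constructing FSDP or to pass a device with an explicit index. A minimal sketch follows; it is not taken from test_fsdp_comm.py, and `module` and the one-process-per-GPU rank assumption are placeholders.

    # Minimal sketch, not the test's code.
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(module: torch.nn.Module) -> FSDP:
        local_rank = dist.get_rank()  # assumes one process per GPU
        # Option 1: make the current device explicit before FSDP initialization.
        torch.cuda.set_device(local_rank)
        # Option 2: pass an indexed device instead of the bare "cuda" string.
        return FSDP(module, device_id=torch.device("cuda", local_rank))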
2025-12-04T12:52:45.8492397Z ======================= 1 failed, 9 deselected in 8.82s ======================== 2025-12-04T12:52:45.8492436Z Got exit code 1 2025-12-04T12:52:45.8492475Z Retrying single test... 2025-12-04T12:52:45.8492664Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-794d127c8ae76675.xml 2025-12-04T12:52:45.8492721Z ============================= test session starts ============================== 2025-12-04T12:52:45.8492833Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8492874Z cachedir: .pytest_cache 2025-12-04T12:52:45.8493032Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8493077Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8493117Z configfile: pytest.ini 2025-12-04T12:52:45.8493289Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8493362Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8493583Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8493629Z Running 1 items in this shard 2025-12-04T12:52:45.8493631Z 2025-12-04T12:52:45.8493943Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda I1204 12:52:34.186000 506683 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506752 2025-12-04T12:52:45.8494098Z I1204 12:52:34.187000 506683 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506753 2025-12-04T12:52:45.8494593Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8494675Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8495163Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8495223Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8496300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8496426Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8497480Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8497606Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8497750Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8497920Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8498245Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8498401Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8498699Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8498824Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8499101Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8499268Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8499556Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8499705Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8499979Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8500117Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8500394Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8500543Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8501018Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8501133Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8501329Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8501684Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8501798Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8502009Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8502185Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8502225Z dist init r=1, world=2 2025-12-04T12:52:45.8502362Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8502522Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8502806Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8502973Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8503259Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8503384Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8503681Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8503827Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8504101Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8504248Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8504522Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8504658Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8504935Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8505084Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8505555Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8505672Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8505867Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8506220Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8506343Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8506554Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8506720Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8506759Z dist init r=0, world=2 2025-12-04T12:52:45.8507107Z [rank0]:[W1204 12:52:41.936098727 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8507146Z FAILED [8.7109s] [100%] 2025-12-04T12:52:45.8507148Z 2025-12-04T12:52:45.8507203Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8507304Z _____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda _____ 2025-12-04T12:52:45.8507350Z Traceback (most recent call last): 2025-12-04T12:52:45.8507510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8507572Z self._join_processes(fn) 2025-12-04T12:52:45.8507744Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8507798Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8507976Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8508020Z raise RuntimeError(error) 2025-12-04T12:52:45.8508099Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8508182Z Traceback (most recent call last): 2025-12-04T12:52:45.8508343Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8508385Z getattr(self, test_name)() 2025-12-04T12:52:45.8508544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8508580Z fn() 2025-12-04T12:52:45.8508731Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8508770Z method(*args, **kwargs) 2025-12-04T12:52:45.8508921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8508962Z method(*args, **kwargs) 2025-12-04T12:52:45.8509111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8509147Z with policy(): 2025-12-04T12:52:45.8509299Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8509339Z raise RuntimeError(msg) 2025-12-04T12:52:45.8509684Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8509687Z 2025-12-04T12:52:45.8509762Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8509990Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8509992Z 2025-12-04T12:52:45.8510080Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8510082Z 2025-12-04T12:52:45.8510156Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8510203Z Traceback (most recent call last): 2025-12-04T12:52:45.8510364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8510407Z getattr(self, test_name)() 2025-12-04T12:52:45.8510566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8510601Z fn() 2025-12-04T12:52:45.8510751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8510791Z method(*args, **kwargs) 2025-12-04T12:52:45.8510954Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8510994Z method(*args, **kwargs) 2025-12-04T12:52:45.8511143Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8511180Z with policy(): 2025-12-04T12:52:45.8511329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8511397Z raise RuntimeError(msg) 2025-12-04T12:52:45.8511739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8511741Z 2025-12-04T12:52:45.8511816Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8512041Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8512044Z 2025-12-04T12:52:45.8512131Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8512133Z 2025-12-04T12:52:45.8512134Z 2025-12-04T12:52:45.8512210Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8512298Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8512531Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-794d127c8ae76675.xml - 2025-12-04T12:52:45.8512591Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8512833Z FAILED [8.7109s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8512878Z Traceback (most recent call last): 2025-12-04T12:52:45.8513043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8513085Z getattr(self, test_name)() 2025-12-04T12:52:45.8513244Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8513280Z fn() 2025-12-04T12:52:45.8513430Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8513472Z method(*args, **kwargs) 2025-12-04T12:52:45.8513621Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8513661Z method(*args, **kwargs) 2025-12-04T12:52:45.8513809Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8513846Z with policy(): 2025-12-04T12:52:45.8514008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8514049Z raise RuntimeError(msg) 2025-12-04T12:52:45.8514393Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8514397Z 2025-12-04T12:52:45.8514471Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8514711Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8514713Z 2025-12-04T12:52:45.8514802Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8514805Z 2025-12-04T12:52:45.8514863Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8514909Z Traceback (most recent call last): 2025-12-04T12:52:45.8515073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8515126Z getattr(self, test_name)() 2025-12-04T12:52:45.8515296Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8515330Z fn() 2025-12-04T12:52:45.8515480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8515519Z method(*args, **kwargs) 2025-12-04T12:52:45.8515669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8515707Z method(*args, **kwargs) 2025-12-04T12:52:45.8515856Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8515893Z with policy(): 2025-12-04T12:52:45.8516043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8516084Z raise RuntimeError(msg) 2025-12-04T12:52:45.8516430Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8516433Z 2025-12-04T12:52:45.8516505Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8516731Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8516733Z 2025-12-04T12:52:45.8516820Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8516885Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
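[editor's note] Every run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the explicit teardown the warning asks for is shown below, assuming the usual torchrun/env:// rendezvous environment; it is not taken from the test file, and run_workload() is a placeholder.

    # Sketch of explicit process-group teardown (hypothetical placeholders).
    import torch.distributed as dist

    def run_workload() -> None:
        ...  # placeholder for the actual test or training body

    def main() -> None:
        dist.init_process_group(backend="nccl")
        try:
            run_workload()
        finally:
            # Explicit teardown avoids the ProcessGroupNCCL warning about
            # destroy_process_group() not being called before program exit.
            dist.destroy_process_group()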
2025-12-04T12:52:45.8516947Z ======================= 1 failed, 9 deselected in 8.72s ======================== 2025-12-04T12:52:45.8516985Z Got exit code 1 2025-12-04T12:52:45.8517163Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8517291Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8517479Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-16355a924f1fdd43.xml 2025-12-04T12:52:45.8517537Z ============================= test session starts ============================== 2025-12-04T12:52:45.8517648Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8517704Z cachedir: .pytest_cache 2025-12-04T12:52:45.8517862Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8517908Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8517952Z configfile: pytest.ini 2025-12-04T12:52:45.8518112Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8518226Z collecting ... collected 10 items / 10 deselected / 0 selected 2025-12-04T12:52:45.8518280Z stepcurrent: skipping 10 already run items. 2025-12-04T12:52:45.8518324Z Running 0 items in this shard 2025-12-04T12:52:45.8518345Z 2025-12-04T12:52:45.8518577Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-16355a924f1fdd43.xml - 2025-12-04T12:52:45.8518637Z ============================ 10 deselected in 0.00s ============================ 2025-12-04T12:52:45.8520533Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda'] 2025-12-04T12:52:45.8520564Z 2025-12-04T12:52:45.8520748Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm 1/1 
(test/test-reports/distributed.fsdp.test_fsdp_comm_1.1_3b36b42e6bf366b5_.log) 2025-12-04T12:52:45.8520750Z 2025-12-04T12:52:45.8520871Z Finished distributed/fsdp/test_fsdp_comm 1/1 ... [2025-12-04 12:52:45.735130][2292264.38431054], took 5.86min 2025-12-04T12:52:45.8521133Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:52:45.8521220Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:52:45.8521316Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:52:45.8521364Z Uploading artifacts took 0.00 seconds 2025-12-04T12:52:45.8521418Z distributed/fsdp/test_fsdp_comm 1/1 failed! 2025-12-04T12:52:45.8521517Z Running distributed/test_c10d_pypg 1/1 ... [2025-12-04 12:52:45.738585][2292264.387768717] 2025-12-04T12:52:45.8521566Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:52:45.8521872Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_pypg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:52:45.738767] 2025-12-04T12:52:53.0138946Z 2025-12-04T12:52:53.0140916Z distributed/test_c10d_pypg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_pypg_1.1_55969333d4e99855_.log 2025-12-04T12:52:53.0156648Z Running 48 items in this shard: test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_no_init_sync, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg, 
test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_no_init_sync, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_abort_shutdown, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_attr_overrides, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_block_current_stream, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_block_current_stream_use_after_free 2025-12-04T12:52:53.0166793Z 2025-12-04T12:52:53.0166942Z Finished distributed/test_c10d_pypg 1/1 ... [2025-12-04 12:52:53.013480][2292271.662659355], took 0.12min 2025-12-04T12:52:53.0167478Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:52:53.0175019Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:52:53.0177900Z Running distributed/test_pg_wrapper 1/1 ... 
[2025-12-04 12:52:53.017679][2292271.666862421] 2025-12-04T12:52:53.0178110Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:52:53.0179993Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_pg_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:52:53.017867] 2025-12-04T12:54:30.9402113Z 2025-12-04T12:54:30.9403404Z distributed/test_pg_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_pg_wrapper_1.1_148baf11d75dd18e_.log 2025-12-04T12:54:30.9410309Z Running 17 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_coalescing_manager_debug_mode_detail, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_hang, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_detail, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_off, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_debug_level_detail_no_gloo, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_new_group_no_gloo, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_hang, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode_off, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T12:54:30.9416197Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_coalescing_manager_debug_mode_detail 2025-12-04T12:54:30.9416983Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_hang 2025-12-04T12:54:30.9417712Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_detail 2025-12-04T12:54:30.9418519Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_off 2025-12-04T12:54:30.9419061Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch 2025-12-04T12:54:30.9419717Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T12:54:30.9420244Z Running 1 items in this shard: 
test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_debug_level_detail_no_gloo 2025-12-04T12:54:30.9420742Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_new_group_no_gloo 2025-12-04T12:54:30.9421214Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_hang 2025-12-04T12:54:30.9421845Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda 2025-12-04T12:54:30.9422401Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda_debug_mode 2025-12-04T12:54:30.9422965Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode 2025-12-04T12:54:30.9423533Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode_off 2025-12-04T12:54:30.9424078Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch 2025-12-04T12:54:30.9424593Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda 2025-12-04T12:54:30.9425143Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda_debug_mode 2025-12-04T12:54:30.9425692Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T12:54:30.9425989Z 2025-12-04T12:54:30.9426167Z Finished distributed/test_pg_wrapper 1/1 ... [2025-12-04 12:54:30.940123][2292369.589302929], took 1.63min 2025-12-04T12:54:30.9426770Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:54:30.9432808Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:54:30.9436242Z Running distributed/tensor/test_utils 1/1 ... [2025-12-04 12:54:30.943527][2292369.592711054] 2025-12-04T12:54:30.9436536Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:54:30.9437992Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:54:30.943713] 2025-12-04T12:55:34.2038837Z 2025-12-04T12:55:34.2042500Z distributed/tensor/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_utils_1.1_aedfc19f01c5775f_.log 2025-12-04T12:55:34.2048218Z Running 24 items in this shard: test/distributed/tensor/test_utils.py::LocalTest::test_compute_local_shape_and_global_offset_uneven, test/distributed/tensor/test_utils.py::UtilTest::test_compute_global_tensor_shape_1D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_global_tensor_shape_1D_invalid_shape, test/distributed/tensor/test_utils.py::UtilTest::test_compute_global_tensor_shape_failure_2D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_1D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_2D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_3D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_4D, test/distributed/tensor/test_utils.py::UtilTest::test_fsdp_tp_meta_compute, test/distributed/tensor/test_utils.py::UtilTest::test_hsdp_tp_meta_compute, test/distributed/tensor/test_utils.py::UtilTest::test_uneven_fsdp_tp_meta_compute, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_global_tensor_info_non_shard_placements, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_global_tensor_info_shard_placement, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_global_tensor_info_unsupported_placement, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_tensor_info, test/distributed/tensor/test_utils.py::TestStridedSharding::test_1d_mesh_strided_sharding, test/distributed/tensor/test_utils.py::TestStridedSharding::test_2d_mesh_2d_tensor_strided_sharding, test/distributed/tensor/test_utils.py::TestStridedSharding::test_2d_mesh_strided_sharding, test/distributed/tensor/test_utils.py::TestStridedSharding::test_2d_mesh_uneven_strided_shard, test/distributed/tensor/test_utils.py::Test_StridedShard_with_shard_order::test_StridedShard_not_convertible_to_shard_order, test/distributed/tensor/test_utils.py::Test_StridedShard_with_shard_order::test_StridedShard_to_shard_order, test/distributed/tensor/test_utils.py::Test2DStridedLocalShard::test_fsdp1_tp_2d_dtensor_local_shards_and_offsets, test/distributed/tensor/test_utils.py::Test2DStridedLocalShard::test_fsdp2_tp_2d_dtensor_local_shards_and_offsets, test/distributed/tensor/test_utils.py::TestExplicitRedistribute::test_explicit_matmul 2025-12-04T12:55:34.2053162Z 2025-12-04T12:55:34.2053319Z Finished distributed/tensor/test_utils 1/1 ... [2025-12-04 12:55:34.203631][2292432.852811298], took 1.05min 2025-12-04T12:55:34.2054975Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:55:34.2071131Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:55:34.2074200Z Running distributed/fsdp/test_fsdp_unshard_params 1/1 ... 
[2025-12-04 12:55:34.207349][2292432.856532648] 2025-12-04T12:55:34.2074431Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:55:34.2076413Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_unshard_params.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:55:34.207530] 2025-12-04T12:56:39.8231544Z 2025-12-04T12:56:39.8233011Z distributed/fsdp/test_fsdp_unshard_params 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_unshard_params_1.1_339d2f7e4cf208e0_.log 2025-12-04T12:56:39.8240801Z Running 15 items in this shard: test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_named_parameters_and_buffers, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_param_data, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_recurse, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_respects_reshard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_writeback, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_singleton_param_writeback, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_submodule, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_with_grads_core, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_with_grads_none_grads, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsNoShard::test_unshard_params_param_data_no_shard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsNoShard::test_unshard_params_writeback_no_shard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_offload_to_cpu_no_shard_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_rank0_only_with_writeback_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_unshard_params_from_backward_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_unshard_params_from_forward_raises 2025-12-04T12:56:39.8246373Z 2025-12-04T12:56:39.8252576Z Finished distributed/fsdp/test_fsdp_unshard_params 1/1 ... [2025-12-04 12:56:39.822765][2292498.471944791], took 1.09min 2025-12-04T12:56:39.8253893Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:56:39.8268578Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:56:39.8271146Z Running distributed/checkpoint/test_state_dict_utils 1/1 ... [2025-12-04 12:56:39.826980][2292498.476163733] 2025-12-04T12:56:39.8271516Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:56:39.8273070Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_state_dict_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:56:39.827156] 2025-12-04T12:57:14.9447883Z 2025-12-04T12:57:14.9449283Z distributed/checkpoint/test_state_dict_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_state_dict_utils_1.1_b968ab5788bde42f_.log 2025-12-04T12:57:14.9452128Z Running 7 items in this shard: test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_complicated_dict, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_cpu_and_ranks_only, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_cpu_offload_for_dtensor, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_create_cpu_state_dict, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_gather_state_dict_dtensor, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_gather_with_cpu_and_ranks_only, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_state_dict_util_distribute_tensors 2025-12-04T12:57:14.9454309Z 2025-12-04T12:57:14.9454596Z Finished distributed/checkpoint/test_state_dict_utils 1/1 ... [2025-12-04 12:57:14.944476][2292533.593655859], took 0.59min 2025-12-04T12:57:14.9467953Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:14.9486448Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:14.9487624Z Running distributed/_shard/sharded_tensor/ops/test_init 1/1 ... [2025-12-04 12:57:14.948606][2292533.597789934] 2025-12-04T12:57:14.9487942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:14.9490919Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:14.948799] 2025-12-04T12:57:31.9881897Z 2025-12-04T12:57:31.9883172Z distributed/_shard/sharded_tensor/ops/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_init_1.1_103df0e7967870d8_.log 2025-12-04T12:57:31.9884689Z Running 3 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_kaiming_uniform, test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_normal, test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_uniform 2025-12-04T12:57:31.9885681Z 2025-12-04T12:57:31.9885928Z Finished distributed/_shard/sharded_tensor/ops/test_init 1/1 ... [2025-12-04 12:57:31.987876][2292550.637055577], took 0.28min 2025-12-04T12:57:31.9902619Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:31.9917672Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:31.9920337Z Running distributed/_shard/sharded_tensor/ops/test_embedding 1/1 ... 
[2025-12-04 12:57:31.991868][2292550.641051445] 2025-12-04T12:57:31.9920802Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:31.9922839Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:31.992054] 2025-12-04T12:57:44.7764390Z 2025-12-04T12:57:44.7765757Z distributed/_shard/sharded_tensor/ops/test_embedding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_embedding_1.1_0331a6abc537409d_.log 2025-12-04T12:57:44.7767436Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_embedding.py::TestShardedEmbedding::test_sharded_embedding_colwise, test/distributed/_shard/sharded_tensor/ops/test_embedding.py::TestShardedEmbedding::test_sharded_embedding_rowwise 2025-12-04T12:57:44.7768649Z 2025-12-04T12:57:44.7769024Z Finished distributed/_shard/sharded_tensor/ops/test_embedding 1/1 ... [2025-12-04 12:57:44.776143][2292563.425322493], took 0.21min 2025-12-04T12:57:44.7786086Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:44.7801675Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:44.7804843Z Running distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 ... [2025-12-04 12:57:44.780319][2292563.429502837] 2025-12-04T12:57:44.7805257Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:44.7806326Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:44.780501] 2025-12-04T12:57:57.4637269Z 2025-12-04T12:57:57.4638735Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_embedding_bag_1.1_878df039e5b1d3c0_.log 2025-12-04T12:57:57.4640165Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_embedding_bag.py::TestShardedEmbeddingBag::test_sharded_embedding_bag_colwise, test/distributed/_shard/sharded_tensor/ops/test_embedding_bag.py::TestShardedEmbeddingBag::test_sharded_embedding_bag_rowwise 2025-12-04T12:57:57.4641092Z 2025-12-04T12:57:57.4642029Z Finished distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 ... [2025-12-04 12:57:57.463458][2292576.112637949], took 0.21min 2025-12-04T12:57:57.4658215Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:57.4674803Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:57.4678752Z Running distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 ... 
[2025-12-04 12:57:57.467635][2292576.116818674] 2025-12-04T12:57:57.4679397Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:57.4680488Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:57.467823] 2025-12-04T12:58:09.5999100Z 2025-12-04T12:58:09.6001354Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.test_sharded_tensor_reshard_1.1_2ef61b254586826d_.log 2025-12-04T12:58:09.6003648Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard, test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard_errors 2025-12-04T12:58:09.6004699Z 2025-12-04T12:58:09.6005168Z Finished distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 ... [2025-12-04 12:58:09.599546][2292588.248725117], took 0.20min 2025-12-04T12:58:09.6020728Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:58:09.6037665Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:58:09.6041164Z Running distributed/fsdp/test_fsdp_core 1/3 ... [2025-12-04 12:58:09.603951][2292588.253134818] 2025-12-04T12:58:09.6041533Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:58:09.6042751Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:58:09.604140] 2025-12-04T13:21:31.3005767Z 2025-12-04T13:21:31.3015002Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 1/3 (test/test-reports/distributed.fsdp.test_fsdp_core_1.3_b5bdac945a318f3b_.log) 2025-12-04T13:21:31.3015708Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f6a1b3360576c80.xml 2025-12-04T13:21:31.3016122Z ============================= test session starts ============================== 2025-12-04T13:21:31.3016413Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3016669Z cachedir: .pytest_cache 2025-12-04T13:21:31.3016960Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3017276Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3017441Z configfile: pytest.ini 2025-12-04T13:21:31.3017732Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3018041Z collecting ... 
collected 60 items 2025-12-04T13:21:31.3018275Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T13:21:31.3022917Z Running 19 items in this shard: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.3026571Z 2025-12-04T13:21:31.3026897Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 12:58:11.346000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 529357 2025-12-04T13:21:31.3027410Z I1204 12:58:11.346000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 529358 2025-12-04T13:21:31.3027761Z I1204 12:58:11.347000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 529359 2025-12-04T13:21:31.3028109Z I1204 12:58:11.348000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 529360 2025-12-04T13:21:31.3028724Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better 
inference performance) 2025-12-04T13:21:31.3029180Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3029779Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3030396Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3030898Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3031347Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3031949Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3032549Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3033010Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3033456Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3033915Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3034385Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3034954Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3035541Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3036127Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3036709Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3038097Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3039585Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3041071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3042509Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3043955Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3045429Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3046862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3048325Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3048630Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3048973Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3049492Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3049993Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3050519Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3050982Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3051466Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3051948Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3052425Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3052887Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3053352Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3053866Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3054399Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T13:21:31.3054912Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3055646Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T13:21:31.3056283Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3056692Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3071705Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3072401Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3072799Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3073253Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3073509Z dist init r=2, world=4 2025-12-04T13:21:31.3073736Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3074108Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3074639Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3075133Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3075670Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3076133Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3076592Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3077093Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3077567Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T13:21:31.3078038Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3078631Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3079192Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3079663Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3080138Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3080818Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 2025-12-04T13:21:31.3081469Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3081838Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3082523Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3083038Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3083411Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3083886Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3084140Z dist init r=0, world=4 2025-12-04T13:21:31.3084354Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3084703Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3085223Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3085755Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3086315Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T13:21:31.3086810Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3087343Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3087883Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3089845Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3090360Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3090875Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3091408Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3091867Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3092338Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3093010Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 
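The figures in each rank's leak report read directly: the first pair is caching-allocator bytes held on that device before and after the test, the second pair is driver-allocated bytes. A back-of-the-envelope check of the rank 3 report above (illustrative arithmetic only; this helper is not part of the test harness):

    # Illustrative arithmetic over the figures printed in the rank 3 leak report above.
    # The constants are copied from the log; nothing here belongs to the PyTorch test harness.
    alloc_before, alloc_after = 512, 19_456                       # caching allocator bytes, device 3
    driver_before, driver_after = 2_250_244_096, 3_552_575_488    # driver-allocated bytes, device 3

    alloc_leak = alloc_after - alloc_before        # 18_944 bytes still tracked by the allocator
    driver_growth = driver_after - driver_before   # ~1.21 GiB newly claimed from the driver

    print(f"caching allocator delta: {alloc_leak} bytes")
    print(f"driver allocation delta: {driver_growth / 2**30:.2f} GiB")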
2025-12-04T13:21:31.3093637Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3094040Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3094692Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3095212Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3095676Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3096107Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3096357Z dist init r=3, world=4 2025-12-04T13:21:31.3096569Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3096956Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3097526Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3098013Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3098550Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3099018Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3099467Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3099983Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3100459Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3100928Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3101391Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3101843Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3102294Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3102760Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3103420Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T13:21:31.3104041Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3104390Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3104983Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3105485Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3105848Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3106284Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3106523Z dist init r=1, world=4 2025-12-04T13:21:31.3106939Z [rank0]:[W1204 12:58:18.472218625 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3107350Z FAILED [9.2138s] [ 5%] 2025-12-04T13:21:31.3107416Z 2025-12-04T13:21:31.3107479Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3107672Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T13:21:31.3107852Z Traceback (most recent call last): 2025-12-04T13:21:31.3108100Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3108388Z self._join_processes(fn) 2025-12-04T13:21:31.3108634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3108919Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3109203Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3109477Z raise RuntimeError(error) 2025-12-04T13:21:31.3109630Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3109792Z Traceback (most recent call last): 2025-12-04T13:21:31.3110032Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3110273Z getattr(self, test_name)() 2025-12-04T13:21:31.3110509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3110745Z fn() 2025-12-04T13:21:31.3110956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3111201Z method(*args, **kwargs) 2025-12-04T13:21:31.3111433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3111667Z method(*args, **kwargs) 2025-12-04T13:21:31.3111884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3112112Z with policy(): 2025-12-04T13:21:31.3112326Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3112562Z raise RuntimeError(msg) 2025-12-04T13:21:31.3112986Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 
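Per the traceback, the mem_leak_check is applied as a context manager in torch/testing/_internal/common_utils.py whose __exit__ compares per-device memory statistics captured around the test body and raises when they grew. A rough standalone approximation of that idea using public torch.cuda APIs (assumed structure for illustration, not the harness's actual implementation):

    # Rough approximation of a per-device CUDA/ROCm leak check around a test body.
    # Assumed structure only; the real check lives in torch/testing/_internal/common_utils.py.
    import contextlib
    import torch

    @contextlib.contextmanager
    def approx_cuda_leak_check(device: int = 0):
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes in use
        free_before, _ = torch.cuda.mem_get_info(device)     # driver-level free bytes
        yield
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver free "
                f"{free_before} -> {free_after} bytes"
            )

    # Usage sketch: a tensor kept alive past the block trips the check.
    # with approx_cuda_leak_check(0):
    #     run_test()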
2025-12-04T13:21:31.3113371Z 2025-12-04T13:21:31.3113445Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3113790Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3114061Z 2025-12-04T13:21:31.3114151Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3114276Z 2025-12-04T13:21:31.3114278Z 2025-12-04T13:21:31.3114357Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3114566Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3114950Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f6a1b3360576c80.xml - 2025-12-04T13:21:31.3115292Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3115647Z FAILED [9.2138s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3115982Z Traceback (most recent call last): 2025-12-04T13:21:31.3116238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3116495Z getattr(self, test_name)() 2025-12-04T13:21:31.3116737Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3116979Z fn() 2025-12-04T13:21:31.3117190Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3117434Z method(*args, **kwargs) 2025-12-04T13:21:31.3117664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3117933Z method(*args, **kwargs) 2025-12-04T13:21:31.3118193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3118443Z with policy(): 2025-12-04T13:21:31.3118653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3118886Z raise RuntimeError(msg) 2025-12-04T13:21:31.3119314Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T13:21:31.3119701Z 2025-12-04T13:21:31.3119776Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3120120Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3120390Z 2025-12-04T13:21:31.3120480Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3120666Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3120827Z ============================== 1 failed in 9.35s =============================== 2025-12-04T13:21:31.3120964Z Got exit code 1 2025-12-04T13:21:31.3121071Z Retrying single test... 2025-12-04T13:21:31.3121333Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0680ba892e5781e1.xml 2025-12-04T13:21:31.3121621Z ============================= test session starts ============================== 2025-12-04T13:21:31.3121837Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3122030Z cachedir: .pytest_cache 2025-12-04T13:21:31.3122262Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3122507Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3122630Z configfile: pytest.ini 2025-12-04T13:21:31.3122862Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3123142Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3123478Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3123780Z Running 1 items in this shard 2025-12-04T13:21:31.3123851Z 2025-12-04T13:21:31.3124180Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 12:58:23.038000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 529759 2025-12-04T13:21:31.3124682Z I1204 12:58:23.039000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 529760 2025-12-04T13:21:31.3125028Z I1204 12:58:23.039000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 529761 2025-12-04T13:21:31.3125374Z I1204 12:58:23.040000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 529762 2025-12-04T13:21:31.3125932Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3126377Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3126833Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3127304Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3127887Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
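The FSDP UserWarning above fires because `device_id` was passed as the bare string "cuda" with no index, so each rank falls back to its current device. A minimal sketch of the two remedies the warning itself suggests (wrap_with_fsdp is a made-up helper name, and the process group for the rank is assumed to be initialized already):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(model, rank):
        # Make the rank's device explicit so FSDP does not have to infer it...
        torch.cuda.set_device(rank)
        # ...and/or pass an indexed device instead of the bare "cuda" string.
        return FSDP(model, device_id=torch.device("cuda", rank))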
2025-12-04T13:21:31.3128525Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3129112Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3129695Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3130155Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3130601Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3131182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3131769Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3132222Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3132661Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3133255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3133845Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3135233Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
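The AccumulateGrad stream-mismatch warning above names its own escape hatch. If the mismatch has been confirmed to be intentional, it can be silenced with the exact call quoted in the warning text; a one-line sketch:

    import torch

    # Suppress the AccumulateGrad stream-mismatch warning quoted above; only
    # appropriate once the stream mismatch is understood to be intentional.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)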
2025-12-04T13:21:31.3136679Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3138130Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3139599Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3141042Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3142462Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3143905Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.3145324Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3145635Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3145989Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3146503Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3147011Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3147511Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3147969Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3148468Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3148949Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3149426Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3149899Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3150363Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3150814Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3151276Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3151751Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3152424Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 
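Each rank that trips the leak check exits with code 10, and the parent test process converts any non-zero child exit code into the "Process N exited with error code 10" RuntimeError shown in the failure blocks. A loose sketch of that spawn, join, and check pattern (illustrative only; run_ranks is a made-up helper, and the real logic lives in common_distributed.py's _join_processes and _check_return_codes):

    import torch.multiprocessing as mp

    def run_ranks(target, world_size=4):
        # Spawn one process per rank, wait for all of them, then surface failures.
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=target, args=(rank, world_size)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")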
2025-12-04T13:21:31.3153055Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3153429Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3154029Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3154551Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3154923Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3155349Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3155600Z dist init r=0, world=4 2025-12-04T13:21:31.3155815Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3156178Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3156701Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3157188Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3157677Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3158134Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3158624Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3159097Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3159568Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3160039Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3160508Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3160961Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3161417Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3161894Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3162580Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T13:21:31.3163211Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3163570Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3164168Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3164681Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3165055Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3165497Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3165762Z dist init r=1, world=4 2025-12-04T13:21:31.3165999Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3166344Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3166839Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3167328Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3167813Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3168314Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3168768Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3169241Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3169714Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3170186Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3170655Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3171118Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3171576Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3172039Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3172726Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 2025-12-04T13:21:31.3173345Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3173695Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3174283Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3174788Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3175193Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3175619Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3175858Z dist init r=3, world=4 2025-12-04T13:21:31.3176064Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3176402Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3176887Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3177367Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3177845Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3178332Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3178770Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3179232Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3179695Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3180163Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3180624Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3181075Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3181543Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3182011Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3182673Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 
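The ProcessGroupNCCL warning logged after each failing run ("destroy_process_group() was not called before program exit") points at cleanup the multiprocess harness skips when ranks abort early. In an ordinary distributed script the shutdown it asks for looks roughly like the sketch below (main is a made-up entry point; MASTER_ADDR and MASTER_PORT are assumed to be set in the environment; see the linked distributed.html#shutdown documentation for the authoritative guidance):

    import torch
    import torch.distributed as dist

    def main(rank, world_size):
        # Assumes MASTER_ADDR / MASTER_PORT are provided via the environment.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)
        try:
            ...  # training / test body for this rank
        finally:
            # Explicitly tear the process group down so NCCL resources are released.
            dist.destroy_process_group()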
2025-12-04T13:21:31.3183291Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3183642Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3184242Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3184777Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3185140Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3185553Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3185792Z dist init r=2, world=4 2025-12-04T13:21:31.3186195Z [rank0]:[W1204 12:58:30.157725631 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3186606Z FAILED [9.1131s] [100%] 2025-12-04T13:21:31.3186673Z 2025-12-04T13:21:31.3186731Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3186922Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T13:21:31.3187101Z Traceback (most recent call last): 2025-12-04T13:21:31.3187346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3187590Z self._join_processes(fn) 2025-12-04T13:21:31.3187835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3188101Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3188421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3188681Z raise RuntimeError(error) 2025-12-04T13:21:31.3188834Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3188996Z Traceback (most recent call last): 2025-12-04T13:21:31.3189237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3189479Z getattr(self, test_name)() 2025-12-04T13:21:31.3189710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3189941Z fn() 2025-12-04T13:21:31.3190143Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3190375Z method(*args, **kwargs) 2025-12-04T13:21:31.3190613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3190845Z method(*args, **kwargs) 2025-12-04T13:21:31.3191064Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3191292Z with policy(): 2025-12-04T13:21:31.3191503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3191736Z raise RuntimeError(msg) 2025-12-04T13:21:31.3192155Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 2025-12-04T13:21:31.3192539Z 2025-12-04T13:21:31.3192615Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3192969Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3193252Z 2025-12-04T13:21:31.3193357Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3193481Z 2025-12-04T13:21:31.3193483Z 2025-12-04T13:21:31.3193563Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3193765Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3194120Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0680ba892e5781e1.xml - 2025-12-04T13:21:31.3194447Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3194791Z FAILED [9.1131s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3195117Z Traceback (most recent call last): 2025-12-04T13:21:31.3195363Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3195606Z getattr(self, test_name)() 2025-12-04T13:21:31.3195837Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3196067Z fn() 2025-12-04T13:21:31.3196270Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3196499Z method(*args, **kwargs) 2025-12-04T13:21:31.3196716Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3196946Z method(*args, **kwargs) 2025-12-04T13:21:31.3197163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3197391Z with policy(): 2025-12-04T13:21:31.3197602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3197836Z raise RuntimeError(msg) 2025-12-04T13:21:31.3198299Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 
2025-12-04T13:21:31.3198679Z 2025-12-04T13:21:31.3198755Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3199110Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3199376Z 2025-12-04T13:21:31.3199464Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3199651Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3199817Z ======================= 1 failed, 18 deselected in 9.25s ======================= 2025-12-04T13:21:31.3199956Z Got exit code 1 2025-12-04T13:21:31.3200056Z Retrying single test... 2025-12-04T13:21:31.3200309Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ec3b60899159e6a.xml 2025-12-04T13:21:31.3200596Z ============================= test session starts ============================== 2025-12-04T13:21:31.3200808Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3200999Z cachedir: .pytest_cache 2025-12-04T13:21:31.3201225Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3201462Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3201625Z configfile: pytest.ini 2025-12-04T13:21:31.3201851Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3202145Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3202474Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3202769Z Running 1 items in this shard 2025-12-04T13:21:31.3202846Z 2025-12-04T13:21:31.3203152Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 12:58:34.484000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 530161 2025-12-04T13:21:31.3203640Z I1204 12:58:34.485000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 530162 2025-12-04T13:21:31.3203989Z I1204 12:58:34.485000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 530163 2025-12-04T13:21:31.3204335Z I1204 12:58:34.486000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 530164 2025-12-04T13:21:31.3204883Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3205323Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3205906Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3206491Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3206944Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3207380Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3207968Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3208588Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3209037Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3209482Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3210047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3210628Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3211090Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3211550Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3212123Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3212702Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3214069Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3215488Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3216932Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3218398Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3219829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3221252Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3222687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3224092Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3224395Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3224739Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3225235Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3225716Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3226195Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3226644Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3227085Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3227551Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3228027Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3228532Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3228993Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3229443Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3229896Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3230387Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3231062Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 2025-12-04T13:21:31.3231700Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3232053Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3232646Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3233152Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3233517Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3233930Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3234169Z dist init r=3, world=4 2025-12-04T13:21:31.3234372Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3234709Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3235197Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3235676Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3236152Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3236596Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3237051Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3237515Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3237979Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3238480Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3238943Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 
2025-12-04T13:21:31.3239392Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3239863Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3240354Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3241015Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T13:21:31.3241635Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3241983Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3242570Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3243082Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3243446Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3243862Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3244105Z dist init r=2, world=4 2025-12-04T13:21:31.3244308Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3244647Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3245132Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3245609Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3246108Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3246556Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3246999Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.3247461Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3247922Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3248430Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3248909Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3249388Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3249843Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3250305Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3250966Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 
2025-12-04T13:21:31.3251587Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3251939Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3252526Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3253027Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3253390Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3253801Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3254042Z dist init r=0, world=4 2025-12-04T13:21:31.3254243Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3254577Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3255059Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3255549Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3256026Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3256475Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3256916Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3257377Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3257845Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3258372Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3258848Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3259296Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3259747Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3260213Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3260872Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T13:21:31.3261492Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3261839Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3262424Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3262927Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3263290Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3263704Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3263944Z dist init r=1, world=4 2025-12-04T13:21:31.3264343Z [rank0]:[W1204 12:58:42.015866417 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3264764Z FAILED [9.4145s] [100%] 2025-12-04T13:21:31.3264828Z 2025-12-04T13:21:31.3264888Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3265082Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T13:21:31.3265261Z Traceback (most recent call last): 2025-12-04T13:21:31.3265505Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3265749Z self._join_processes(fn) 2025-12-04T13:21:31.3265993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3266258Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3266526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3266785Z raise RuntimeError(error) 2025-12-04T13:21:31.3266942Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3267104Z Traceback (most recent call last): 2025-12-04T13:21:31.3267376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3267635Z getattr(self, test_name)() 2025-12-04T13:21:31.3267865Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3268096Z fn() 2025-12-04T13:21:31.3268332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3268563Z method(*args, **kwargs) 2025-12-04T13:21:31.3268783Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3269014Z method(*args, **kwargs) 2025-12-04T13:21:31.3269297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3269527Z with policy(): 2025-12-04T13:21:31.3269740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3269974Z raise RuntimeError(msg) 2025-12-04T13:21:31.3270391Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 
2025-12-04T13:21:31.3270772Z 2025-12-04T13:21:31.3270849Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3271188Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3271452Z 2025-12-04T13:21:31.3271543Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3271670Z 2025-12-04T13:21:31.3271672Z 2025-12-04T13:21:31.3271752Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3271955Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3272311Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ec3b60899159e6a.xml - 2025-12-04T13:21:31.3272641Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3272987Z FAILED [9.4145s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3273335Z Traceback (most recent call last): 2025-12-04T13:21:31.3273581Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3273827Z getattr(self, test_name)() 2025-12-04T13:21:31.3274058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3274290Z fn() 2025-12-04T13:21:31.3274491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3274720Z method(*args, **kwargs) 2025-12-04T13:21:31.3274940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3275167Z method(*args, **kwargs) 2025-12-04T13:21:31.3275384Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3275613Z with policy(): 2025-12-04T13:21:31.3275824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3276090Z raise RuntimeError(msg) 2025-12-04T13:21:31.3276509Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 2025-12-04T13:21:31.3276916Z 2025-12-04T13:21:31.3276992Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3277333Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3277597Z 2025-12-04T13:21:31.3277686Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3277874Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
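The RuntimeError above is raised by the leak-check policy that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables (the `with policy():` context from common_utils.py visible in the traceback): it records caching-allocator and CUDA-driver allocations per device before the test body and compares them afterwards. A rough analogue of that before/after comparison, sketched here only for illustration and not the actual CUDAMemoryLeakCheck implementation, looks like:

import torch

def assert_no_caching_allocator_growth(fn, device: int = 0) -> None:
    # Illustrative only: the real check in torch/testing/_internal/common_utils.py
    # also queries driver-level allocations, which this sketch does not.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)  # bytes held by the caching allocator
    fn()
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went from {before} to {after} bytes"
        )

In the log this comparison fails on every rank (the allocator goes from 512 to 19456 bytes), so the check raises and each worker exits with code 10.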
2025-12-04T13:21:31.3278042Z ======================= 1 failed, 18 deselected in 9.55s ======================= 2025-12-04T13:21:31.3278248Z Got exit code 1 2025-12-04T13:21:31.3278486Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3278821Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3279174Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dbdb1831962e97ea.xml 2025-12-04T13:21:31.3279457Z ============================= test session starts ============================== 2025-12-04T13:21:31.3279671Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3279899Z cachedir: .pytest_cache 2025-12-04T13:21:31.3280125Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3280366Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3280485Z configfile: pytest.ini 2025-12-04T13:21:31.3280712Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3280985Z collecting ... collected 60 items / 1 deselected / 59 selected 2025-12-04T13:21:31.3281148Z stepcurrent: skipping 1 already run items. 2025-12-04T13:21:31.3281279Z Running 18 items in this shard 2025-12-04T13:21:31.3281353Z 2025-12-04T13:21:31.3281661Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:58:46.474000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 530563 2025-12-04T13:21:31.3282173Z I1204 12:58:46.475000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 530564 2025-12-04T13:21:31.3282523Z I1204 12:58:46.476000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 530565 2025-12-04T13:21:31.3282863Z I1204 12:58:46.476000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 530566 2025-12-04T13:21:31.3283412Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3283851Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3284286Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3284720Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3285309Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3285926Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3286509Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3287091Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3287547Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3287989Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3288594Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3289175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3289625Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3290063Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3290631Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3291211Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3291450Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3291811Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3292305Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3292787Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3293264Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3293716Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3294172Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3294652Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3295136Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3295598Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3296064Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3296519Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3296977Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3297445Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3298111Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
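The UserWarnings from torch/distributed/fsdp/_init_utils.py above note that FSDP got the argument `device_id` cuda without an explicit index and suggest either calling torch.cuda.set_device() before FSDP initialization or passing the explicit device index. A minimal sketch of both remedies, assuming a per-process `rank` supplied by the launcher (this is not the test's own wrapping code):

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(model: torch.nn.Module, rank: int) -> FSDP:
    # Remedy 1 from the warning: make the current device explicit before FSDP init.
    torch.cuda.set_device(rank)
    # Remedy 2: pass a device with an explicit index instead of the bare "cuda".
    return FSDP(model, device_id=torch.device("cuda", rank))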
2025-12-04T13:21:31.3298771Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3299124Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3299717Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3300224Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3300591Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3301023Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3301267Z dist init r=1, world=4 2025-12-04T13:21:31.3301471Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3301811Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3302304Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3302784Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3303265Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3303725Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3304198Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3304665Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3305129Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3305596Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3330563Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3331080Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3331554Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3332033Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3332714Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3333358Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3333723Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3334324Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3334891Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3335270Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3335696Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3335946Z dist init r=3, world=4 2025-12-04T13:21:31.3336161Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3336507Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3337005Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3337491Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3338005Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3338525Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3338972Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3339443Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3339917Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3340388Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3340856Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3341314Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3341777Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3342250Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3342922Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:21:31.3343555Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3343911Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3344520Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3345031Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3345404Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3345826Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3346070Z dist init r=2, world=4 2025-12-04T13:21:31.3346277Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3346619Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3347128Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3347647Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3348130Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3348628Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3349077Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3349554Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3350027Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3350494Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3350958Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3351413Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3351875Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3352352Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3353018Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3353662Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3354016Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3354606Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3355115Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3355485Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3355908Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3356154Z dist init r=0, world=4 2025-12-04T13:21:31.3356581Z [rank0]:[W1204 12:58:54.844170899 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3357026Z FAILED [9.4156s] [ 5%] 2025-12-04T13:21:31.3357092Z 2025-12-04T13:21:31.3357158Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3357354Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:21:31.3357537Z Traceback (most recent call last): 2025-12-04T13:21:31.3357791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3358042Z self._join_processes(fn) 2025-12-04T13:21:31.3358353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3358626Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3358906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3359170Z raise RuntimeError(error) 2025-12-04T13:21:31.3359328Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3359496Z Traceback (most recent call last): 2025-12-04T13:21:31.3359745Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3359993Z getattr(self, test_name)() 2025-12-04T13:21:31.3360230Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3360466Z fn() 2025-12-04T13:21:31.3360675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3360912Z method(*args, **kwargs) 2025-12-04T13:21:31.3361144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3361379Z method(*args, **kwargs) 2025-12-04T13:21:31.3361602Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3361836Z with policy(): 2025-12-04T13:21:31.3362054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3362292Z raise RuntimeError(msg) 2025-12-04T13:21:31.3362736Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 2025-12-04T13:21:31.3363119Z 2025-12-04T13:21:31.3363202Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3363547Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3363813Z 2025-12-04T13:21:31.3363909Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3364036Z 2025-12-04T13:21:31.3364305Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3364452Z Traceback (most recent call last): 2025-12-04T13:21:31.3364702Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3364949Z getattr(self, test_name)() 2025-12-04T13:21:31.3365187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3365425Z fn() 2025-12-04T13:21:31.3365648Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3365923Z method(*args, **kwargs) 2025-12-04T13:21:31.3366147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3366383Z method(*args, **kwargs) 2025-12-04T13:21:31.3366606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3366839Z with policy(): 2025-12-04T13:21:31.3367056Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3367293Z raise RuntimeError(msg) 2025-12-04T13:21:31.3367721Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 
2025-12-04T13:21:31.3368107Z 2025-12-04T13:21:31.3368222Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3368567Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3368833Z 2025-12-04T13:21:31.3368923Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3369053Z 2025-12-04T13:21:31.3369055Z 2025-12-04T13:21:31.3369135Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3369340Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3369703Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dbdb1831962e97ea.xml - 2025-12-04T13:21:31.3370037Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3370386Z FAILED [9.4156s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3370712Z Traceback (most recent call last): 2025-12-04T13:21:31.3370966Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3371217Z getattr(self, test_name)() 2025-12-04T13:21:31.3371455Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3371692Z fn() 2025-12-04T13:21:31.3371914Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3372151Z method(*args, **kwargs) 2025-12-04T13:21:31.3372378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3372613Z method(*args, **kwargs) 2025-12-04T13:21:31.3372836Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3373067Z with policy(): 2025-12-04T13:21:31.3373285Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3373519Z raise RuntimeError(msg) 2025-12-04T13:21:31.3373943Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:21:31.3374329Z 2025-12-04T13:21:31.3374425Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3374782Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3375059Z 2025-12-04T13:21:31.3375154Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3375279Z 2025-12-04T13:21:31.3375340Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3375483Z Traceback (most recent call last): 2025-12-04T13:21:31.3375732Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3375978Z getattr(self, test_name)() 2025-12-04T13:21:31.3376212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3376448Z fn() 2025-12-04T13:21:31.3376656Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3376893Z method(*args, **kwargs) 2025-12-04T13:21:31.3377116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3377349Z method(*args, **kwargs) 2025-12-04T13:21:31.3377571Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3377800Z with policy(): 2025-12-04T13:21:31.3378015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3378294Z raise RuntimeError(msg) 2025-12-04T13:21:31.3378718Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3379101Z 2025-12-04T13:21:31.3379182Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3379522Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3379783Z 2025-12-04T13:21:31.3379876Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3380068Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3380238Z ======================= 1 failed, 1 deselected in 9.56s ======================== 2025-12-04T13:21:31.3380382Z Got exit code 1 2025-12-04T13:21:31.3380485Z Retrying single test... 
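Both failing sessions above also end with the ProcessGroupNCCL.cpp warning that destroy_process_group() was not called before program exit, which can leak resources. A minimal sketch of the explicit teardown it asks for, with placeholder rank/world-size wiring rather than the test harness's own setup:

import torch.distributed as dist

def run(rank: int, world_size: int) -> None:
    # With the default env:// init method, MASTER_ADDR/MASTER_PORT and friends
    # come from the launcher's environment.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        ...  # test or training body for this rank
    finally:
        # Explicit teardown avoids the "destroy_process_group() was not called"
        # warning seen before each worker exits in the log above.
        dist.destroy_process_group()

The retry below reproduces the same leak, so this warning is separate from the failure itself.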
2025-12-04T13:21:31.3380766Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4bd3df182e000d22.xml 2025-12-04T13:21:31.3381059Z ============================= test session starts ============================== 2025-12-04T13:21:31.3381279Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3381471Z cachedir: .pytest_cache 2025-12-04T13:21:31.3381701Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3381943Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3382068Z configfile: pytest.ini 2025-12-04T13:21:31.3382303Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3382582Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3382918Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3383249Z Running 1 items in this shard 2025-12-04T13:21:31.3383328Z 2025-12-04T13:21:31.3383648Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:58:58.235000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 530965 2025-12-04T13:21:31.3384144Z I1204 12:58:58.235000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 530966 2025-12-04T13:21:31.3384492Z I1204 12:58:58.236000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 530967 2025-12-04T13:21:31.3384836Z I1204 12:58:58.237000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 530968 2025-12-04T13:21:31.3385393Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3385840Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3386423Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3387021Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3387476Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3387916Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3388520Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3389104Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3389552Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3390003Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3390569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3391150Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3391596Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3392033Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3392623Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3393229Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3393467Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3393810Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3394298Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3394777Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3395257Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3395706Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3396146Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3396609Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3397072Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3397536Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3397997Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3398494Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3398949Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3399426Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3400089Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3400711Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3401057Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3401643Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3402187Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3402565Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3402979Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3403218Z dist init r=0, world=4 2025-12-04T13:21:31.3403421Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3403758Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3404246Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3404726Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3405201Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3405649Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3406087Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3406550Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3407012Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3407473Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3407934Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3408433Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3408890Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3409356Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3410016Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3410632Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3410980Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3411595Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3412110Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3412471Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3412882Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3413122Z dist init r=3, world=4 2025-12-04T13:21:31.3413324Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3413663Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3414147Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3414627Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3415103Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3415549Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3415988Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3416449Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3416909Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3417372Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3417845Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3418334Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3418790Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3419252Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3419909Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:21:31.3420554Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3420914Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3421496Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3421996Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3422358Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3422771Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3423010Z dist init r=2, world=4 2025-12-04T13:21:31.3423211Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3423547Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3424031Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3424508Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3424985Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3425432Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3425869Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3426329Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3426804Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3427265Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3427725Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3428224Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3428680Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3429143Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3429815Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:21:31.3430458Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3430804Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3431389Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3431887Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3432249Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3432660Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3432896Z dist init r=1, world=4 2025-12-04T13:21:31.3433296Z [rank0]:[W1204 12:59:06.961221808 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3433707Z FAILED [9.7141s] [100%] 2025-12-04T13:21:31.3433773Z 2025-12-04T13:21:31.3433832Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3434023Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:21:31.3434199Z Traceback (most recent call last): 2025-12-04T13:21:31.3434444Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3434685Z self._join_processes(fn) 2025-12-04T13:21:31.3434930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3435193Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3435458Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3435715Z raise RuntimeError(error) 2025-12-04T13:21:31.3435888Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3436048Z Traceback (most recent call last): 2025-12-04T13:21:31.3436286Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3436526Z getattr(self, test_name)() 2025-12-04T13:21:31.3436756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3436984Z fn() 2025-12-04T13:21:31.3437185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3437414Z method(*args, **kwargs) 2025-12-04T13:21:31.3437634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3437861Z method(*args, **kwargs) 2025-12-04T13:21:31.3438078Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3438336Z with policy(): 2025-12-04T13:21:31.3438567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3438828Z raise RuntimeError(msg) 2025-12-04T13:21:31.3439242Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T13:21:31.3439624Z 2025-12-04T13:21:31.3439698Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3440035Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3440297Z 2025-12-04T13:21:31.3440386Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3440509Z 2025-12-04T13:21:31.3440511Z 2025-12-04T13:21:31.3440594Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3440795Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3441150Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4bd3df182e000d22.xml - 2025-12-04T13:21:31.3441477Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3441819Z FAILED [9.7141s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3442139Z Traceback (most recent call last): 2025-12-04T13:21:31.3442388Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3442628Z getattr(self, test_name)() 2025-12-04T13:21:31.3442862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3443094Z fn() 2025-12-04T13:21:31.3443295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3443523Z method(*args, **kwargs) 2025-12-04T13:21:31.3443740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3443969Z method(*args, **kwargs) 2025-12-04T13:21:31.3444185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3444409Z with policy(): 2025-12-04T13:21:31.3444634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3444863Z raise RuntimeError(msg) 2025-12-04T13:21:31.3445282Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3445660Z 2025-12-04T13:21:31.3445735Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3446069Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3446329Z 2025-12-04T13:21:31.3446416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3446603Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3446765Z ======================= 1 failed, 18 deselected in 9.85s ======================= 2025-12-04T13:21:31.3446913Z Got exit code 1 2025-12-04T13:21:31.3447020Z Retrying single test... 2025-12-04T13:21:31.3447288Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-97bb0ef2ed351f4f.xml 2025-12-04T13:21:31.3447568Z ============================= test session starts ============================== 2025-12-04T13:21:31.3447777Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3447961Z cachedir: .pytest_cache 2025-12-04T13:21:31.3448214Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3448450Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3448566Z configfile: pytest.ini 2025-12-04T13:21:31.3448791Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3449063Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3449391Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3449685Z Running 1 items in this shard 2025-12-04T13:21:31.3449758Z 2025-12-04T13:21:31.3450060Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:59:10.585000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 531367 2025-12-04T13:21:31.3450546Z I1204 12:59:10.586000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 531368 2025-12-04T13:21:31.3450887Z I1204 12:59:10.587000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 531369 2025-12-04T13:21:31.3451226Z I1204 12:59:10.588000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 531370 2025-12-04T13:21:31.3451778Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3452225Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3452805Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3453404Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3453856Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3454293Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3454867Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3455449Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3455914Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3456362Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3456943Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3457523Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3457969Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3458435Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3459004Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3459585Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3459825Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3460171Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3460665Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3461145Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3461623Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3462071Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3462524Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3462989Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3463456Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3463919Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3464380Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3464829Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3465298Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3465783Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3466466Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
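The repeated UserWarning above ("FSDP got the argument `device_id` cuda ..., which does not have an explicit index") names its own two fixes: call torch.cuda.set_device() before constructing FSDP, or pass a device with an explicit index as device_id. A minimal sketch of both follows, assuming one process per GPU with the local rank as the device index and a process group that is already initialized; the helper name and the nn.Linear module are placeholders.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(rank: int) -> FSDP:
        # Fix 1 from the warning: make the current device explicit before FSDP init.
        torch.cuda.set_device(rank)
        model = nn.Linear(8, 8)
        # Fix 2 from the warning: pass an indexed device instead of a bare "cuda".
        return FSDP(model, device_id=torch.device(f"cuda:{rank}"))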
2025-12-04T13:21:31.3467088Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3467438Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3468026Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3468559Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3468923Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3469338Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3469579Z dist init r=0, world=4 2025-12-04T13:21:31.3469786Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3470123Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3470608Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3471085Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3471560Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3472026Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3472465Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3472931Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3473394Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3473854Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3474317Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3474785Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3475265Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3475730Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3476390Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:21:31.3477016Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3477366Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3477952Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3478585Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3478953Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3479368Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3479609Z dist init r=2, world=4 2025-12-04T13:21:31.3479813Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3480151Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3480640Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3481118Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3481615Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3482064Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3482502Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3482967Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3483430Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3483909Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3484389Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3484859Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3485315Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3485783Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3486444Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3487066Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3487416Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3488001Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3488117Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3488370Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3488539Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3488578Z dist init r=3, world=4 2025-12-04T13:21:31.3488719Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3488879Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3489181Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3489336Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3489625Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3489750Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3490030Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3490180Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3490475Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3490651Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3490931Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3491070Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3491351Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3491501Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3491976Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:21:31.3492091Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3492289Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3492643Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3492760Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3492974Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3493140Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3493181Z dist init r=1, world=4 2025-12-04T13:21:31.3493532Z [rank0]:[W1204 12:59:18.416892830 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3493576Z FAILED [9.8144s] [100%] 2025-12-04T13:21:31.3493578Z 2025-12-04T13:21:31.3493637Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3493734Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:21:31.3493781Z Traceback (most recent call last): 2025-12-04T13:21:31.3493946Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3493990Z self._join_processes(fn) 2025-12-04T13:21:31.3494165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3494220Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3494400Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3494456Z raise RuntimeError(error) 2025-12-04T13:21:31.3494555Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3494616Z Traceback (most recent call last): 2025-12-04T13:21:31.3494780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3494824Z getattr(self, test_name)() 2025-12-04T13:21:31.3494982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3495018Z fn() 2025-12-04T13:21:31.3495170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3495212Z method(*args, **kwargs) 2025-12-04T13:21:31.3495365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3495407Z method(*args, **kwargs) 2025-12-04T13:21:31.3495561Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3495601Z with policy(): 2025-12-04T13:21:31.3495753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3495795Z raise RuntimeError(msg) 2025-12-04T13:21:31.3496142Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T13:21:31.3496145Z 2025-12-04T13:21:31.3496224Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3496452Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3496457Z 2025-12-04T13:21:31.3496546Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3496548Z 2025-12-04T13:21:31.3496550Z 2025-12-04T13:21:31.3496628Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3496716Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3496954Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-97bb0ef2ed351f4f.xml - 2025-12-04T13:21:31.3497016Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3497271Z FAILED [9.8144s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3497319Z Traceback (most recent call last): 2025-12-04T13:21:31.3497486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3497530Z getattr(self, test_name)() 2025-12-04T13:21:31.3497693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3497727Z fn() 2025-12-04T13:21:31.3497881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3497922Z method(*args, **kwargs) 2025-12-04T13:21:31.3498075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3498115Z method(*args, **kwargs) 2025-12-04T13:21:31.3498310Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3498380Z with policy(): 2025-12-04T13:21:31.3498533Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3498590Z raise RuntimeError(msg) 2025-12-04T13:21:31.3498937Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3498940Z 2025-12-04T13:21:31.3499016Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3499244Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3499246Z 2025-12-04T13:21:31.3499335Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3499400Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3499465Z ======================= 1 failed, 18 deselected in 9.95s ======================= 2025-12-04T13:21:31.3499502Z Got exit code 1 2025-12-04T13:21:31.3499679Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3499807Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3500002Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c3d64beaed0e8212.xml 2025-12-04T13:21:31.3500064Z ============================= test session starts ============================== 2025-12-04T13:21:31.3500178Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3500221Z cachedir: .pytest_cache 2025-12-04T13:21:31.3500382Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3500431Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3500472Z configfile: pytest.ini 2025-12-04T13:21:31.3500637Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3500712Z collecting ... collected 60 items / 2 deselected / 58 selected 2025-12-04T13:21:31.3500768Z stepcurrent: skipping 2 already run items. 2025-12-04T13:21:31.3500812Z Running 17 items in this shard 2025-12-04T13:21:31.3500814Z 2025-12-04T13:21:31.3501134Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 12:59:23.001000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 531769 2025-12-04T13:21:31.3501291Z I1204 12:59:23.002000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 531770 2025-12-04T13:21:31.3501448Z I1204 12:59:23.002000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 531771 2025-12-04T13:21:31.3501599Z I1204 12:59:23.003000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 531772 2025-12-04T13:21:31.3502189Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3502229Z _warn_cpu_init() 2025-12-04T13:21:31.3502815Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3502877Z _warn_cpu_init() 2025-12-04T13:21:31.3503447Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3503488Z _warn_cpu_init() 2025-12-04T13:21:31.3504055Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3504092Z _warn_cpu_init() 2025-12-04T13:21:31.3504387Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.3504431Z return func(*args, **kwargs) 2025-12-04T13:21:31.3504579Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3504744Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3505036Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3505192Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3505490Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3505617Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3505897Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3506048Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3506325Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3506474Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3506762Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3506912Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3507209Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3507358Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3507838Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
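Both warnings above point the same way: bind each worker to its GPU explicitly. The CPU-sharding-init warning (_warn_cpu_init) is addressed by passing device_id to FSDP as in the earlier sketch, and the barrier() warning says device_id can be given to init_process_group so collectives no longer guess the device from the current context. A hedged sketch of that setup follows; the rank-to-device mapping and the rendezvous address are illustrative.

    import torch
    import torch.distributed as dist

    def init_for_rank(rank: int, world_size: int) -> None:
        device = torch.device(f"cuda:{rank}")
        # Telling the process group its device up front is the fix the barrier()
        # warning above suggests.
        dist.init_process_group(
            "nccl", rank=rank, world_size=world_size,
            init_method="tcp://127.0.0.1:29501",
            device_id=device,
        )
        dist.barrier()  # no longer has to infer the device from the current context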
2025-12-04T13:21:31.3507957Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3508207Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3508567Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3508682Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3508897Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3509064Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3509107Z dist init r=0, world=4 2025-12-04T13:21:31.3509247Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3509409Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3509699Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3509870Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3510158Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3510284Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3510564Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3510712Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3510990Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3511156Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3511458Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3511597Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3511876Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3512027Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3512503Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3512621Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3512816Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3513173Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3513290Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3513502Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3513670Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3513810Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3513972Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3514270Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3514428Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3514716Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3514840Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3515116Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3515265Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3515553Z [rank2]:E1204 
12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3515721Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3515998Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3516136Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3516416Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3516566Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3517041Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3517158Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3517355Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3517711Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3517827Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3518038Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3518244Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3518283Z dist init r=1, world=4 2025-12-04T13:21:31.3518323Z dist init r=2, world=4 2025-12-04T13:21:31.3518477Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3518642Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3518932Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3519089Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, 
test_name)() 2025-12-04T13:21:31.3519374Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3519501Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3519794Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3519970Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3520246Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3520393Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3520674Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3520813Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3521094Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3521245Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3521719Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3521836Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3522034Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3522390Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3522505Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3522727Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3522895Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3522935Z dist init r=3, world=4 2025-12-04T13:21:31.3523280Z [rank0]:[W1204 12:59:54.080538617 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3523323Z FAILED [33.1317s] [ 5%] 2025-12-04T13:21:31.3523325Z 2025-12-04T13:21:31.3523384Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3523486Z ____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____ 2025-12-04T13:21:31.3523534Z Traceback (most recent call last): 2025-12-04T13:21:31.3523699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3523745Z self._join_processes(fn) 2025-12-04T13:21:31.3523929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3524008Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3524186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3524232Z raise RuntimeError(error) 2025-12-04T13:21:31.3524314Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3524361Z Traceback (most recent call last): 2025-12-04T13:21:31.3524525Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3524568Z getattr(self, test_name)() 2025-12-04T13:21:31.3524730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3524765Z fn() 2025-12-04T13:21:31.3524920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3524962Z method(*args, **kwargs) 2025-12-04T13:21:31.3525117Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3525157Z method(*args, **kwargs) 2025-12-04T13:21:31.3525310Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3525347Z with policy(): 2025-12-04T13:21:31.3525502Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3525543Z raise RuntimeError(msg) 2025-12-04T13:21:31.3525896Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3525900Z 2025-12-04T13:21:31.3525976Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3526207Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3526210Z 2025-12-04T13:21:31.3526300Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3526302Z 2025-12-04T13:21:31.3526363Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3526411Z Traceback (most recent call last): 2025-12-04T13:21:31.3526591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3526635Z getattr(self, test_name)() 2025-12-04T13:21:31.3526796Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3526836Z fn() 2025-12-04T13:21:31.3526987Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3527029Z method(*args, **kwargs) 2025-12-04T13:21:31.3527180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3527222Z method(*args, **kwargs) 2025-12-04T13:21:31.3527373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3527411Z with policy(): 2025-12-04T13:21:31.3527566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3527609Z raise RuntimeError(msg) 2025-12-04T13:21:31.3527968Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:21:31.3527991Z 2025-12-04T13:21:31.3528071Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3528346Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3528348Z 2025-12-04T13:21:31.3528435Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3528437Z 2025-12-04T13:21:31.3528439Z 2025-12-04T13:21:31.3528519Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3528608Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3528847Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c3d64beaed0e8212.xml - 2025-12-04T13:21:31.3528910Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3529158Z FAILED [33.1317s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3529204Z Traceback (most recent call last): 2025-12-04T13:21:31.3529370Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3529413Z getattr(self, test_name)() 2025-12-04T13:21:31.3529575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3529610Z fn() 2025-12-04T13:21:31.3529777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3529823Z method(*args, **kwargs) 2025-12-04T13:21:31.3529975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3530018Z method(*args, **kwargs) 2025-12-04T13:21:31.3530169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3530208Z with policy(): 2025-12-04T13:21:31.3530360Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3530402Z raise RuntimeError(msg) 2025-12-04T13:21:31.3530785Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3530789Z 2025-12-04T13:21:31.3530865Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3531094Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3531096Z 2025-12-04T13:21:31.3531183Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3531186Z 2025-12-04T13:21:31.3531245Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3531291Z Traceback (most recent call last): 2025-12-04T13:21:31.3531456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3531499Z getattr(self, test_name)() 2025-12-04T13:21:31.3531675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3531739Z fn() 2025-12-04T13:21:31.3531892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3531933Z method(*args, **kwargs) 2025-12-04T13:21:31.3532085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3532124Z method(*args, **kwargs) 2025-12-04T13:21:31.3532277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3532314Z with policy(): 2025-12-04T13:21:31.3532468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3532510Z raise RuntimeError(msg) 2025-12-04T13:21:31.3532860Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3532864Z 2025-12-04T13:21:31.3532938Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3533167Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3533170Z 2025-12-04T13:21:31.3533257Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3533323Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3533389Z ======================= 1 failed, 2 deselected in 33.27s ======================= 2025-12-04T13:21:31.3533426Z Got exit code 1 2025-12-04T13:21:31.3533469Z Retrying single test... 
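[editor note] The repro line printed repeatedly in the session above sets two environment flags and runs the test file directly from the base repo dir. A small Python wrapper around that exact logged command, for convenience only; nothing beyond the command shown in the log is assumed:

    import os
    import subprocess

    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    # Setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 would suppress the repro message, per the log.
    subprocess.run(
        ["python", "test/distributed/fsdp/test_fsdp_core.py",
         "TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda"],
        env=env,
        check=False,
    )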
2025-12-04T13:21:31.3533662Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0e3e8cedde9f2a88.xml 2025-12-04T13:21:31.3533724Z ============================= test session starts ============================== 2025-12-04T13:21:31.3533838Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3533882Z cachedir: .pytest_cache 2025-12-04T13:21:31.3534043Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3534091Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3534132Z configfile: pytest.ini 2025-12-04T13:21:31.3534308Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3534385Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3534611Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3534655Z Running 1 items in this shard 2025-12-04T13:21:31.3534657Z 2025-12-04T13:21:31.3534961Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 12:59:58.440000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 532171 2025-12-04T13:21:31.3535115Z I1204 12:59:58.441000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 532172 2025-12-04T13:21:31.3535269Z I1204 12:59:58.441000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 532173 2025-12-04T13:21:31.3535424Z I1204 12:59:58.442000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 532174 2025-12-04T13:21:31.3536026Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3536075Z _warn_cpu_init() 2025-12-04T13:21:31.3536643Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3536682Z _warn_cpu_init() 2025-12-04T13:21:31.3537253Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3537290Z _warn_cpu_init() 2025-12-04T13:21:31.3537858Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3537897Z _warn_cpu_init() 2025-12-04T13:21:31.3538232Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3538276Z return func(*args, **kwargs) 2025-12-04T13:21:31.3538421Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3538586Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3538891Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3539053Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3539341Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3539469Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3539748Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3539900Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3540191Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3540368Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3540645Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3540783Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3541065Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3541214Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3541699Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3541816Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3542013Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3542372Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3542489Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3542702Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3542866Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3542908Z dist init r=3, world=4 2025-12-04T13:21:31.3543057Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3543220Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3543510Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3543666Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3543957Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3544083Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3544371Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3544540Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3544817Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3544966Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3545243Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3545383Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3545661Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3545811Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3546291Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3546408Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3546607Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3546963Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3547079Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3547302Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3547468Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3547509Z dist init r=2, world=4 2025-12-04T13:21:31.3547648Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3547809Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3548097Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3548312Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3548613Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3548751Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3549042Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3549193Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3549470Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3549621Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3549900Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3550038Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3550315Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3550464Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3550944Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:21:31.3551060Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3551258Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3551613Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3551740Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3551954Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3552119Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3552161Z dist init r=1, world=4 2025-12-04T13:21:31.3552298Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3552459Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3552747Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3552913Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3553226Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3553349Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3553628Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3553777Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3554055Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3554205Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3554483Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3554623Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3554902Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3555055Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3555532Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3555649Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3555846Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3556213Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3556332Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3556542Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3556709Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3556748Z dist init r=0, world=4 2025-12-04T13:21:31.3557087Z [rank0]:[W1204 13:00:29.506401813 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3557149Z FAILED [33.0345s] [100%] 2025-12-04T13:21:31.3557168Z 2025-12-04T13:21:31.3557227Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3557340Z ____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____ 2025-12-04T13:21:31.3557388Z Traceback (most recent call last): 2025-12-04T13:21:31.3557554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3557599Z self._join_processes(fn) 2025-12-04T13:21:31.3557774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3557828Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3558009Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3558053Z raise RuntimeError(error) 2025-12-04T13:21:31.3558138Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3558208Z Traceback (most recent call last): 2025-12-04T13:21:31.3558371Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3558413Z getattr(self, test_name)() 2025-12-04T13:21:31.3558572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3558606Z fn() 2025-12-04T13:21:31.3558759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3558800Z method(*args, **kwargs) 2025-12-04T13:21:31.3558957Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3558997Z method(*args, **kwargs) 2025-12-04T13:21:31.3559150Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3559189Z with policy(): 2025-12-04T13:21:31.3559344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3559385Z raise RuntimeError(msg) 2025-12-04T13:21:31.3559739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3559741Z 2025-12-04T13:21:31.3559819Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3560069Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3560073Z 2025-12-04T13:21:31.3560165Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3560168Z 2025-12-04T13:21:31.3560170Z 2025-12-04T13:21:31.3560245Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3560335Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3560571Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0e3e8cedde9f2a88.xml - 2025-12-04T13:21:31.3560633Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3560880Z FAILED [33.0345s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3560928Z Traceback (most recent call last): 2025-12-04T13:21:31.3561107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3561182Z getattr(self, test_name)() 2025-12-04T13:21:31.3561344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3561379Z fn() 2025-12-04T13:21:31.3561534Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3561575Z method(*args, **kwargs) 2025-12-04T13:21:31.3561728Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3561768Z method(*args, **kwargs) 2025-12-04T13:21:31.3561922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3561959Z with policy(): 2025-12-04T13:21:31.3562116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3562159Z raise RuntimeError(msg) 2025-12-04T13:21:31.3562511Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3562513Z 2025-12-04T13:21:31.3562587Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3562817Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3562820Z 2025-12-04T13:21:31.3562908Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3562975Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
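[editor note] Both failing sessions also emit the ProcessGroupNCCL warning that destroy_process_group() was not called before the worker exited. A minimal sketch of the teardown that warning asks for; the rendezvous details (MASTER_ADDR/MASTER_PORT, backend) are illustrative and not taken from the test harness:

    import os
    import torch.distributed as dist

    def run_worker(rank: int, world_size: int) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # assumed single-node rendezvous
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
        try:
            pass  # the test body would run here
        finally:
            dist.destroy_process_group()  # explicit shutdown, as the warning recommends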
2025-12-04T13:21:31.3563041Z ====================== 1 failed, 18 deselected in 33.17s ======================= 2025-12-04T13:21:31.3563081Z Got exit code 1 2025-12-04T13:21:31.3563122Z Retrying single test... 2025-12-04T13:21:31.3563314Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-85422e17b079f439.xml 2025-12-04T13:21:31.3563374Z ============================= test session starts ============================== 2025-12-04T13:21:31.3563487Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3563529Z cachedir: .pytest_cache 2025-12-04T13:21:31.3563698Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3563745Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3563786Z configfile: pytest.ini 2025-12-04T13:21:31.3563952Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3564027Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3564253Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3564297Z Running 1 items in this shard 2025-12-04T13:21:31.3564299Z 2025-12-04T13:21:31.3564608Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 13:00:33.883000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 532573 2025-12-04T13:21:31.3564765Z I1204 13:00:33.883000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 532574 2025-12-04T13:21:31.3564932Z I1204 13:00:33.884000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 532575 2025-12-04T13:21:31.3565105Z I1204 13:00:33.884000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 532576 2025-12-04T13:21:31.3565683Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3565723Z _warn_cpu_init() 2025-12-04T13:21:31.3566291Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3566333Z _warn_cpu_init() 2025-12-04T13:21:31.3566902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3566941Z _warn_cpu_init() 2025-12-04T13:21:31.3567505Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3567544Z _warn_cpu_init() 2025-12-04T13:21:31.3567836Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3567879Z return func(*args, **kwargs) 2025-12-04T13:21:31.3568035Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3568243Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3568536Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3568695Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3568984Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3569111Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3569409Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3569585Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3569865Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3570015Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3570293Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3570432Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3570713Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3570862Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3571343Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3571462Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3571659Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3572018Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3572134Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3572347Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3572524Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3572568Z dist init r=0, world=4 2025-12-04T13:21:31.3572707Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3572870Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3587739Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3587926Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3588282Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3588479Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3588782Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3588935Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3589213Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3589363Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3589640Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3589781Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3590060Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3590208Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3590693Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3590814Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3591013Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3591375Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3591504Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3591721Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3591889Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3591931Z dist init r=3, world=4 2025-12-04T13:21:31.3592072Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3592232Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3592521Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3592687Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3592984Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3593122Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3593400Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3593549Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3593830Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3593978Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3594255Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3594392Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3594670Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3594819Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3595298Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3595415Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3595611Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3595981Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3596099Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3596312Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3596477Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3596517Z dist init r=1, world=4 2025-12-04T13:21:31.3596656Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3596816Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3597116Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3597298Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3597584Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3597708Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3597985Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3598136Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.3598461Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3598610Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3598885Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3599023Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3599301Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3599451Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3599926Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3600059Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3600256Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3600612Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3600727Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3600941Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3601106Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3601145Z dist init r=2, world=4 2025-12-04T13:21:31.3601497Z [rank0]:[W1204 13:01:05.081495291 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3601580Z FAILED [33.2375s] [100%] 2025-12-04T13:21:31.3601583Z 2025-12-04T13:21:31.3601643Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3601746Z ____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____ 2025-12-04T13:21:31.3601793Z Traceback (most recent call last): 2025-12-04T13:21:31.3601961Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3602005Z self._join_processes(fn) 2025-12-04T13:21:31.3602181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3602236Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3602416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3602461Z raise RuntimeError(error) 2025-12-04T13:21:31.3602544Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3602590Z Traceback (most recent call last): 2025-12-04T13:21:31.3602752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3602795Z getattr(self, test_name)() 2025-12-04T13:21:31.3602954Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3602989Z fn() 2025-12-04T13:21:31.3603143Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3603186Z method(*args, **kwargs) 2025-12-04T13:21:31.3603339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3603382Z method(*args, **kwargs) 2025-12-04T13:21:31.3603532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3603569Z with policy(): 2025-12-04T13:21:31.3603720Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3603761Z raise RuntimeError(msg) 2025-12-04T13:21:31.3604124Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3604128Z 2025-12-04T13:21:31.3604207Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3604438Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3604442Z 2025-12-04T13:21:31.3604532Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3604535Z 2025-12-04T13:21:31.3604596Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3604642Z Traceback (most recent call last): 2025-12-04T13:21:31.3604807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3604848Z getattr(self, test_name)() 2025-12-04T13:21:31.3605010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3605044Z fn() 2025-12-04T13:21:31.3605208Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3605269Z method(*args, **kwargs) 2025-12-04T13:21:31.3605421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3605461Z method(*args, **kwargs) 2025-12-04T13:21:31.3605611Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3605647Z with policy(): 2025-12-04T13:21:31.3605800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3605839Z raise RuntimeError(msg) 2025-12-04T13:21:31.3606191Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3606195Z 2025-12-04T13:21:31.3606271Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3606499Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3606502Z 2025-12-04T13:21:31.3606591Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3606593Z 2025-12-04T13:21:31.3606595Z 2025-12-04T13:21:31.3606672Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3606762Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.3606997Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-85422e17b079f439.xml - 2025-12-04T13:21:31.3607061Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3607311Z FAILED [33.2375s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3607359Z Traceback (most recent call last): 2025-12-04T13:21:31.3607523Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3607566Z getattr(self, test_name)() 2025-12-04T13:21:31.3607728Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3607761Z fn() 2025-12-04T13:21:31.3607926Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3607965Z method(*args, **kwargs) 2025-12-04T13:21:31.3608119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3608202Z method(*args, **kwargs) 2025-12-04T13:21:31.3608353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3608389Z with policy(): 2025-12-04T13:21:31.3608542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3608581Z raise RuntimeError(msg) 2025-12-04T13:21:31.3608934Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3608936Z 2025-12-04T13:21:31.3609027Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3609270Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3609444Z 2025-12-04T13:21:31.3609531Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3609534Z 2025-12-04T13:21:31.3609592Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3609637Z Traceback (most recent call last): 2025-12-04T13:21:31.3609798Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3609840Z getattr(self, test_name)() 2025-12-04T13:21:31.3610000Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3610034Z fn() 2025-12-04T13:21:31.3610185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3610226Z method(*args, **kwargs) 2025-12-04T13:21:31.3610376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3610417Z method(*args, **kwargs) 2025-12-04T13:21:31.3610566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3610604Z with policy(): 2025-12-04T13:21:31.3610754Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3610794Z raise RuntimeError(msg) 2025-12-04T13:21:31.3611143Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3611146Z 2025-12-04T13:21:31.3611221Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3611448Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3611451Z 2025-12-04T13:21:31.3611537Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3611603Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3611666Z ====================== 1 failed, 18 deselected in 33.38s ======================= 2025-12-04T13:21:31.3611704Z Got exit code 1 2025-12-04T13:21:31.3611893Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3612023Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3612214Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3c8429d2d3d8f75c.xml 2025-12-04T13:21:31.3612274Z ============================= test session starts ============================== 2025-12-04T13:21:31.3612390Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3612432Z cachedir: .pytest_cache 2025-12-04T13:21:31.3612591Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3612639Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3612680Z configfile: pytest.ini 2025-12-04T13:21:31.3612846Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3612930Z collecting ... collected 60 items / 3 deselected / 57 selected 2025-12-04T13:21:31.3612994Z stepcurrent: skipping 3 already run items. 2025-12-04T13:21:31.3613047Z Running 16 items in this shard 2025-12-04T13:21:31.3613049Z 2025-12-04T13:21:31.3613369Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 13:01:09.649000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 532975 2025-12-04T13:21:31.3613525Z I1204 13:01:09.650000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 532976 2025-12-04T13:21:31.3613677Z I1204 13:01:09.651000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 532977 2025-12-04T13:21:31.3613830Z I1204 13:01:09.652000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 532978 2025-12-04T13:21:31.3614416Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3614458Z _warn_cpu_init() 2025-12-04T13:21:31.3615025Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3615063Z _warn_cpu_init() 2025-12-04T13:21:31.3615633Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3615670Z _warn_cpu_init() 2025-12-04T13:21:31.3616247Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3616284Z _warn_cpu_init() 2025-12-04T13:21:31.3616579Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3616623Z return func(*args, **kwargs) 2025-12-04T13:21:31.3616768Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3616931Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3617221Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3617386Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3617696Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3617823Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3618100Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3618284Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3618563Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3618712Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3618990Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3619128Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3619407Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3619557Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3620046Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3620162Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3620370Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3620740Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3620856Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3621070Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3621234Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3621274Z dist init r=2, world=4 2025-12-04T13:21:31.3621417Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3621589Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3621887Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3622052Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3622336Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3622461Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3622739Z [rank3]:E1204 13:01:40.994000 532978 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3622888Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3623164Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3623311Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3623588Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3623726Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3624004Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3624154Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3624649Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3624763Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3624960Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3625326Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3625441Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3625654Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3625819Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3625876Z dist init r=3, world=4 2025-12-04T13:21:31.3626016Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3626187Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3626474Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3626628Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3626912Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3627038Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3627315Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3627461Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3627737Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3627885Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3628201Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3628337Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3628616Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3628763Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3629261Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3629378Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3629573Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3629936Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3630050Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3630274Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3630461Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3630500Z dist init r=0, world=4 2025-12-04T13:21:31.3630637Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3630798Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3631086Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3631239Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3631524Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3631646Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3631922Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3632069Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3632345Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3632494Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3632769Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3632905Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3633193Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3633342Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3633826Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3633940Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3634137Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3634517Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3634652Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3634862Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3635026Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3635063Z dist init r=1, world=4 2025-12-04T13:21:31.3635407Z [rank0]:[W1204 13:01:41.978118501 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3635449Z FAILED [33.2346s] [ 6%] 2025-12-04T13:21:31.3635452Z 2025-12-04T13:21:31.3635510Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3635618Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.3635663Z Traceback (most recent call last): 2025-12-04T13:21:31.3635828Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3635871Z self._join_processes(fn) 2025-12-04T13:21:31.3636044Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3636097Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3636279Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3636323Z raise RuntimeError(error) 2025-12-04T13:21:31.3636405Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3636450Z Traceback (most recent call last): 2025-12-04T13:21:31.3636610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3636652Z getattr(self, test_name)() 2025-12-04T13:21:31.3636810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3636844Z fn() 2025-12-04T13:21:31.3636996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3637036Z method(*args, **kwargs) 2025-12-04T13:21:31.3637197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3637237Z method(*args, **kwargs) 2025-12-04T13:21:31.3637390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3637428Z with policy(): 2025-12-04T13:21:31.3637581Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3637621Z raise RuntimeError(msg) 2025-12-04T13:21:31.3637981Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:21:31.3637984Z 2025-12-04T13:21:31.3638061Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3638363Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3638381Z 2025-12-04T13:21:31.3638481Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3638483Z 2025-12-04T13:21:31.3638485Z 2025-12-04T13:21:31.3638561Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3638651Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3638885Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3c8429d2d3d8f75c.xml - 2025-12-04T13:21:31.3638947Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3639203Z FAILED [33.2346s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3639250Z Traceback (most recent call last): 2025-12-04T13:21:31.3639415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3639457Z getattr(self, test_name)() 2025-12-04T13:21:31.3639618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3639652Z fn() 2025-12-04T13:21:31.3639804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3639843Z method(*args, **kwargs) 2025-12-04T13:21:31.3639994Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3640035Z method(*args, **kwargs) 2025-12-04T13:21:31.3640186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3640224Z with policy(): 2025-12-04T13:21:31.3640376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3640418Z raise RuntimeError(msg) 2025-12-04T13:21:31.3640777Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3640779Z 2025-12-04T13:21:31.3640853Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3641105Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3641107Z 2025-12-04T13:21:31.3641199Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3641262Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3641326Z ======================= 1 failed, 3 deselected in 33.37s ======================= 2025-12-04T13:21:31.3641364Z Got exit code 1 2025-12-04T13:21:31.3641404Z Retrying single test... 2025-12-04T13:21:31.3641593Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c20ad9eb622651c0.xml 2025-12-04T13:21:31.3641651Z ============================= test session starts ============================== 2025-12-04T13:21:31.3641764Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3641807Z cachedir: .pytest_cache 2025-12-04T13:21:31.3641964Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3642021Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3642071Z configfile: pytest.ini 2025-12-04T13:21:31.3642248Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3642323Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3642558Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3642601Z Running 1 items in this shard 2025-12-04T13:21:31.3642603Z 2025-12-04T13:21:31.3642922Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 13:01:45.518000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 533377 2025-12-04T13:21:31.3643079Z I1204 13:01:45.519000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 533378 2025-12-04T13:21:31.3643230Z I1204 13:01:45.520000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 533379 2025-12-04T13:21:31.3643383Z I1204 13:01:45.521000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 533380 2025-12-04T13:21:31.3643960Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3643999Z _warn_cpu_init() 2025-12-04T13:21:31.3644568Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3644608Z _warn_cpu_init() 2025-12-04T13:21:31.3645185Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3645222Z _warn_cpu_init() 2025-12-04T13:21:31.3645787Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3645826Z _warn_cpu_init() 2025-12-04T13:21:31.3646117Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3646160Z return func(*args, **kwargs) 2025-12-04T13:21:31.3646303Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3646476Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3646785Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3646942Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3647228Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3647354Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3647632Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3647783Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3648062Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3648255Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3648533Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3648672Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3648956Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3649104Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3649611Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3649729Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3649925Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3650294Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3650408Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3650622Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3650800Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3650963Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3651123Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3651412Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3651565Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3651851Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3651977Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3652254Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3652402Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3652679Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3652825Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3653102Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3653238Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3653519Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3653666Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3654163Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:21:31.3654280Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3654475Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3654841Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3654955Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3655187Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3655361Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3655400Z dist init r=2, world=4 2025-12-04T13:21:31.3655439Z dist init r=1, world=4 2025-12-04T13:21:31.3655577Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3655737Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3656032Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3656187Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3656471Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3656595Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3656870Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3657019Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3657295Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3657443Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3657719Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3657855Z [rank0]:E1204 13:02:16.973000 533377 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3658183Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3658333Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3658818Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3658934Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3659129Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3659512Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3659650Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3659861Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3660024Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3660163Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3660324Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3660614Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3660769Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3661054Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3661178Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3661455Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3661605Z [rank3]:E1204 13:02:16.974000 533380 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3661881Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3662027Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3662316Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3662452Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3662732Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3662881Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3663367Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3663481Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3663696Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3664078Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3664191Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3664403Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3664568Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3664608Z dist init r=0, world=4 2025-12-04T13:21:31.3664645Z dist init r=3, world=4 2025-12-04T13:21:31.3664987Z [rank0]:[W1204 13:02:17.953865349 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3665028Z FAILED [33.2347s] [100%] 2025-12-04T13:21:31.3665031Z 2025-12-04T13:21:31.3665087Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3665195Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.3665240Z Traceback (most recent call last): 2025-12-04T13:21:31.3665405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3665448Z self._join_processes(fn) 2025-12-04T13:21:31.3665622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3665677Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3665855Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3665898Z raise RuntimeError(error) 2025-12-04T13:21:31.3665979Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3666024Z Traceback (most recent call last): 2025-12-04T13:21:31.3666187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3666229Z getattr(self, test_name)() 2025-12-04T13:21:31.3666399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3666433Z fn() 2025-12-04T13:21:31.3666587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3666628Z method(*args, **kwargs) 2025-12-04T13:21:31.3666781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3666820Z method(*args, **kwargs) 2025-12-04T13:21:31.3666972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3667007Z with policy(): 2025-12-04T13:21:31.3667160Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3667200Z raise RuntimeError(msg) 2025-12-04T13:21:31.3667573Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
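The leak errors above report caching-allocator bytes and driver-allocated bytes before and after the test. As a rough illustration of what those numbers correspond to (this is not the CI leak checker's internal implementation), one can snapshot allocator and driver memory around a suspect block:

    import torch

    torch.cuda.synchronize()
    alloc_before = torch.cuda.memory_allocated()    # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info()  # driver-level free/total bytes

    run_suspect_code()  # hypothetical stand-in for the test body

    torch.cuda.synchronize()
    alloc_after = torch.cuda.memory_allocated()
    free_after, _ = torch.cuda.mem_get_info()
    if alloc_after > alloc_before:
        print(f"possible leak: allocator {alloc_before} -> {alloc_after} bytes, "
              f"driver-allocated {total - free_before} -> {total - free_after} bytes")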
2025-12-04T13:21:31.3667595Z 2025-12-04T13:21:31.3667671Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3667910Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3667912Z 2025-12-04T13:21:31.3668001Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3668004Z 2025-12-04T13:21:31.3668005Z 2025-12-04T13:21:31.3668080Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3668207Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3668444Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c20ad9eb622651c0.xml - 2025-12-04T13:21:31.3668507Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3668764Z FAILED [33.2347s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3668810Z Traceback (most recent call last): 2025-12-04T13:21:31.3668974Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3669016Z getattr(self, test_name)() 2025-12-04T13:21:31.3669175Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3669210Z fn() 2025-12-04T13:21:31.3669362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3669403Z method(*args, **kwargs) 2025-12-04T13:21:31.3669553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3669595Z method(*args, **kwargs) 2025-12-04T13:21:31.3669747Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3669784Z with policy(): 2025-12-04T13:21:31.3669937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3669976Z raise RuntimeError(msg) 2025-12-04T13:21:31.3670353Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3670355Z 2025-12-04T13:21:31.3670431Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3670673Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3670675Z 2025-12-04T13:21:31.3670763Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3670826Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3670890Z ====================== 1 failed, 18 deselected in 33.40s ======================= 2025-12-04T13:21:31.3670926Z Got exit code 1 2025-12-04T13:21:31.3670967Z Retrying single test... 2025-12-04T13:21:31.3671157Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80f6c7f9f6e17155.xml 2025-12-04T13:21:31.3671229Z ============================= test session starts ============================== 2025-12-04T13:21:31.3671354Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3671409Z cachedir: .pytest_cache 2025-12-04T13:21:31.3671568Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3671615Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3671655Z configfile: pytest.ini 2025-12-04T13:21:31.3671820Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3671894Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3672131Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3672173Z Running 1 items in this shard 2025-12-04T13:21:31.3672177Z 2025-12-04T13:21:31.3672495Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 13:02:21.497000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 533779 2025-12-04T13:21:31.3672652Z I1204 13:02:21.497000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 533780 2025-12-04T13:21:31.3672803Z I1204 13:02:21.498000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 533781 2025-12-04T13:21:31.3672953Z I1204 13:02:21.499000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 533782 2025-12-04T13:21:31.3673535Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3673575Z _warn_cpu_init() 2025-12-04T13:21:31.3674143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3674179Z _warn_cpu_init() 2025-12-04T13:21:31.3674757Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3674796Z _warn_cpu_init() 2025-12-04T13:21:31.3675360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3675397Z _warn_cpu_init() 2025-12-04T13:21:31.3675696Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3676086Z return func(*args, **kwargs) 2025-12-04T13:21:31.3676229Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3676392Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3676680Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3676837Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3677125Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3677254Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3677532Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3677680Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3677957Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3678104Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3678435Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3678572Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3678849Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3679014Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3679502Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3679622Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3679817Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3680187Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3680332Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3680560Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3680724Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3680762Z dist init r=0, world=4 2025-12-04T13:21:31.3680900Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3681059Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3681350Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3681504Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3681792Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3681916Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3682195Z [rank1]:E1204 13:02:52.804000 533780 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3682344Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3682619Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3682768Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3683043Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3683189Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3683468Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3683618Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3684105Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:21:31.3684220Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3684416Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3684805Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3684930Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3685141Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3685306Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3685347Z dist init r=1, world=4 2025-12-04T13:21:31.3685484Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3685645Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3685933Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3686086Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3686370Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3686496Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3686773Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3686922Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3687197Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3687343Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3687630Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3687766Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3688045Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3688260Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3688745Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3688890Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3689098Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3689462Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3689575Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3689787Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3689950Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3689992Z dist init r=3, world=4 2025-12-04T13:21:31.3690131Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3690289Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3690575Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3690730Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3691019Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3691144Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3691421Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3691567Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3691856Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3692004Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3692281Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3692417Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3692693Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3692843Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3693342Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3693475Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3693671Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3694036Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3694150Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3694362Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3694527Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3694565Z dist init r=2, world=4 2025-12-04T13:21:31.3694901Z [rank0]:[W1204 13:02:52.636059224 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3694943Z FAILED [33.2358s] [100%] 2025-12-04T13:21:31.3694946Z 2025-12-04T13:21:31.3695001Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3695110Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.3695157Z Traceback (most recent call last): 2025-12-04T13:21:31.3695321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3695364Z self._join_processes(fn) 2025-12-04T13:21:31.3695538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3695592Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3695771Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3695813Z raise RuntimeError(error) 2025-12-04T13:21:31.3695903Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3695948Z Traceback (most recent call last): 2025-12-04T13:21:31.3696111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3696153Z getattr(self, test_name)() 2025-12-04T13:21:31.3696313Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3696347Z fn() 2025-12-04T13:21:31.3696500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3696539Z method(*args, **kwargs) 2025-12-04T13:21:31.3696690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3696729Z method(*args, **kwargs) 2025-12-04T13:21:31.3696881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3696917Z with policy(): 2025-12-04T13:21:31.3697096Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3697149Z raise RuntimeError(msg) 2025-12-04T13:21:31.3697507Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
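Two warnings above point at process-group hygiene: barrier() recommends specifying `device_id` in `init_process_group`, and ProcessGroupNCCL warns when `destroy_process_group()` is never called before exit. A minimal sketch of both, assuming the usual env:// variables (RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT) are set by the launcher:

    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ["RANK"]) % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),  # lets barrier() pick the right device
    )
    try:
        dist.barrier()
        # ... test body ...
    finally:
        dist.destroy_process_group()  # avoids the resource-leak warning at program exit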
2025-12-04T13:21:31.3697510Z 2025-12-04T13:21:31.3697586Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3697830Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3697833Z 2025-12-04T13:21:31.3697924Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3697928Z 2025-12-04T13:21:31.3697986Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3698033Z Traceback (most recent call last): 2025-12-04T13:21:31.3698241Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3698283Z getattr(self, test_name)() 2025-12-04T13:21:31.3698441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3698476Z fn() 2025-12-04T13:21:31.3698625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3698665Z method(*args, **kwargs) 2025-12-04T13:21:31.3698816Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3698856Z method(*args, **kwargs) 2025-12-04T13:21:31.3699006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3699044Z with policy(): 2025-12-04T13:21:31.3699196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3699236Z raise RuntimeError(msg) 2025-12-04T13:21:31.3699593Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3699595Z 2025-12-04T13:21:31.3699667Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3699921Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3699925Z 2025-12-04T13:21:31.3700012Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3700015Z 2025-12-04T13:21:31.3700018Z 2025-12-04T13:21:31.3700092Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3700181Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.3700413Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80f6c7f9f6e17155.xml - 2025-12-04T13:21:31.3700474Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3700728Z FAILED [33.2358s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3700776Z Traceback (most recent call last): 2025-12-04T13:21:31.3700968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3701023Z getattr(self, test_name)() 2025-12-04T13:21:31.3701182Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3701216Z fn() 2025-12-04T13:21:31.3701368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3701409Z method(*args, **kwargs) 2025-12-04T13:21:31.3701559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3701598Z method(*args, **kwargs) 2025-12-04T13:21:31.3701748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3701786Z with policy(): 2025-12-04T13:21:31.3701938Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3701981Z raise RuntimeError(msg) 2025-12-04T13:21:31.3702344Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3702346Z 2025-12-04T13:21:31.3702419Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3702658Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3702661Z 2025-12-04T13:21:31.3702746Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3702750Z 2025-12-04T13:21:31.3702809Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3702854Z Traceback (most recent call last): 2025-12-04T13:21:31.3703017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3703058Z getattr(self, test_name)() 2025-12-04T13:21:31.3703219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3703252Z fn() 2025-12-04T13:21:31.3703403Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3703441Z method(*args, **kwargs) 2025-12-04T13:21:31.3703601Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3703640Z method(*args, **kwargs) 2025-12-04T13:21:31.3703793Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3703830Z with policy(): 2025-12-04T13:21:31.3703982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3704023Z raise RuntimeError(msg) 2025-12-04T13:21:31.3704378Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3704381Z 2025-12-04T13:21:31.3704456Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3704706Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3704720Z 2025-12-04T13:21:31.3704816Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3704879Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3704943Z ====================== 1 failed, 18 deselected in 33.37s ======================= 2025-12-04T13:21:31.3704979Z Got exit code 1 2025-12-04T13:21:31.3705167Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3705297Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3705488Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c97dc3beffec5ac9.xml 2025-12-04T13:21:31.3705547Z ============================= test session starts ============================== 2025-12-04T13:21:31.3705660Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3705702Z cachedir: .pytest_cache 2025-12-04T13:21:31.3705859Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3705905Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3705945Z configfile: pytest.ini 2025-12-04T13:21:31.3706107Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3706181Z collecting ... collected 60 items / 4 deselected / 56 selected 2025-12-04T13:21:31.3706234Z stepcurrent: skipping 4 already run items. 2025-12-04T13:21:31.3706277Z Running 15 items in this shard 2025-12-04T13:21:31.3706279Z 2025-12-04T13:21:31.3706596Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:02:57.279000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 534181 2025-12-04T13:21:31.3706752Z I1204 13:02:57.279000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 534182 2025-12-04T13:21:31.3706906Z I1204 13:02:57.280000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 534183 2025-12-04T13:21:31.3707056Z I1204 13:02:57.281000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 534184 2025-12-04T13:21:31.3707646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3707686Z _warn_cpu_init() 2025-12-04T13:21:31.3708294Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3708332Z _warn_cpu_init() 2025-12-04T13:21:31.3708916Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3708979Z _warn_cpu_init() 2025-12-04T13:21:31.3709551Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3709586Z _warn_cpu_init() 2025-12-04T13:21:31.3709879Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3709922Z return func(*args, **kwargs) 2025-12-04T13:21:31.3710066Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3710229Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3710540Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3710696Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3710988Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3711115Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3711396Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3711545Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3711823Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3711991Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3712270Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3712409Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3712686Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3712833Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3713329Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3713465Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3713662Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3714033Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3714148Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3714363Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3714527Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3714567Z dist init r=0, world=4 2025-12-04T13:21:31.3714704Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3714865Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3715152Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3715308Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3715595Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3715718Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3715995Z [rank3]:E1204 13:03:34.997000 534184 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3716151Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3716430Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3716579Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3716855Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3716992Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3717270Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3717430Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3717932Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:21:31.3718048Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3718276Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3718643Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3718760Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3718971Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3719136Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3719175Z dist init r=3, world=4 2025-12-04T13:21:31.3719314Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3719474Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3719763Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3719918Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3720203Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3720328Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3720618Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3720768Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3721045Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3721192Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3721469Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3721625Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3721914Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3722076Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3722560Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3722674Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3722872Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3723238Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3723351Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3723564Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3723727Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3723767Z dist init r=1, world=4 2025-12-04T13:21:31.3723905Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3724066Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3724351Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3724506Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3724800Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3724926Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3725205Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3725352Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3725631Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3725780Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3726066Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3726234Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3726511Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3726660Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3727143Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3727260Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3727454Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3727818Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3727933Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3728186Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3728351Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3728390Z dist init r=2, world=4 2025-12-04T13:21:31.3728729Z [rank0]:[W1204 13:03:35.903145783 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3728768Z FAILED [39.8366s] [ 6%] 2025-12-04T13:21:31.3728770Z 2025-12-04T13:21:31.3728828Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3728945Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3728992Z Traceback (most recent call last): 2025-12-04T13:21:31.3729157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3729202Z self._join_processes(fn) 2025-12-04T13:21:31.3729375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3729430Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3729606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3729651Z raise RuntimeError(error) 2025-12-04T13:21:31.3729733Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3729777Z Traceback (most recent call last): 2025-12-04T13:21:31.3729940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3730009Z getattr(self, test_name)() 2025-12-04T13:21:31.3730169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3730215Z fn() 2025-12-04T13:21:31.3730368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3730409Z method(*args, **kwargs) 2025-12-04T13:21:31.3730561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3730600Z method(*args, **kwargs) 2025-12-04T13:21:31.3730751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3730790Z with policy(): 2025-12-04T13:21:31.3730945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3730987Z raise RuntimeError(msg) 2025-12-04T13:21:31.3731347Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3731351Z 2025-12-04T13:21:31.3731426Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3731666Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3731668Z 2025-12-04T13:21:31.3731756Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3731760Z 2025-12-04T13:21:31.3731819Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3731866Z Traceback (most recent call last): 2025-12-04T13:21:31.3732030Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3732073Z getattr(self, test_name)() 2025-12-04T13:21:31.3732231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3732266Z fn() 2025-12-04T13:21:31.3732416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3732457Z method(*args, **kwargs) 2025-12-04T13:21:31.3732606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3732647Z method(*args, **kwargs) 2025-12-04T13:21:31.3732810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3732849Z with policy(): 2025-12-04T13:21:31.3733002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3733045Z raise RuntimeError(msg) 2025-12-04T13:21:31.3733402Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3733405Z 2025-12-04T13:21:31.3733479Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3733720Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3733722Z 2025-12-04T13:21:31.3733809Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3733830Z 2025-12-04T13:21:31.3733890Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3733945Z Traceback (most recent call last): 2025-12-04T13:21:31.3734108Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3734149Z getattr(self, test_name)() 2025-12-04T13:21:31.3734308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3734340Z fn() 2025-12-04T13:21:31.3734491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3734530Z method(*args, **kwargs) 2025-12-04T13:21:31.3734682Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3734721Z method(*args, **kwargs) 2025-12-04T13:21:31.3734872Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3734909Z with policy(): 2025-12-04T13:21:31.3735062Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3735102Z raise RuntimeError(msg) 2025-12-04T13:21:31.3735461Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3735463Z 2025-12-04T13:21:31.3735537Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3735776Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3735779Z 2025-12-04T13:21:31.3735866Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3735868Z 2025-12-04T13:21:31.3735870Z 2025-12-04T13:21:31.3735945Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3736034Z Process 0 terminated with exit code 10, terminating remaining processes. 
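Two of the warnings repeated above are actionable in user or test code: the FSDP UserWarning from _init_utils.py recommends passing `device_id` so sharding initialization runs on the GPU (and is required for `sync_module_states=True`), and the ProcessGroupNCCL warning asks for an explicit destroy_process_group() before exit. A minimal per-rank sketch of both, where the model and the rank/world-size wiring are placeholders rather than anything taken from the test:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def run(rank: int, world_size: int) -> None:
        torch.cuda.set_device(rank)
        # Passing device_id here also mutes the barrier() warning about the
        # device being taken from the current context.
        dist.init_process_group(
            "nccl", rank=rank, world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        model = torch.nn.Linear(8, 8)  # placeholder module, initially on CPU
        # device_id moves the module to the GPU for sharding init, as the
        # FSDP warning above recommends.
        fsdp_model = FSDP(model, device_id=rank, sync_module_states=True)
        # ... training / test body using fsdp_model ...
        dist.barrier()
        # Explicit teardown avoids the "destroy_process_group() was not
        # called before program exit" warning.
        dist.destroy_process_group()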
2025-12-04T13:21:31.3736269Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c97dc3beffec5ac9.xml - 2025-12-04T13:21:31.3736332Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3736596Z FAILED [39.8366s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3736645Z Traceback (most recent call last): 2025-12-04T13:21:31.3736812Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3736854Z getattr(self, test_name)() 2025-12-04T13:21:31.3737014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3737047Z fn() 2025-12-04T13:21:31.3737199Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3737237Z method(*args, **kwargs) 2025-12-04T13:21:31.3737388Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3737428Z method(*args, **kwargs) 2025-12-04T13:21:31.3737578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3737632Z with policy(): 2025-12-04T13:21:31.3737784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3737834Z raise RuntimeError(msg) 2025-12-04T13:21:31.3738226Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3738229Z 2025-12-04T13:21:31.3738301Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3738539Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3738541Z 2025-12-04T13:21:31.3738630Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3738633Z 2025-12-04T13:21:31.3738690Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3738736Z Traceback (most recent call last): 2025-12-04T13:21:31.3738898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3738940Z getattr(self, test_name)() 2025-12-04T13:21:31.3739098Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3739132Z fn() 2025-12-04T13:21:31.3739282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3739322Z method(*args, **kwargs) 2025-12-04T13:21:31.3739472Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3739513Z method(*args, **kwargs) 2025-12-04T13:21:31.3739664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3739701Z with policy(): 2025-12-04T13:21:31.3739852Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3739894Z raise RuntimeError(msg) 2025-12-04T13:21:31.3740247Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3740251Z 2025-12-04T13:21:31.3740339Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3740577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3740581Z 2025-12-04T13:21:31.3740668Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3740670Z 2025-12-04T13:21:31.3740729Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3740773Z Traceback (most recent call last): 2025-12-04T13:21:31.3740937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3740978Z getattr(self, test_name)() 2025-12-04T13:21:31.3741137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3741170Z fn() 2025-12-04T13:21:31.3741321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3741360Z method(*args, **kwargs) 2025-12-04T13:21:31.3741543Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3741594Z method(*args, **kwargs) 2025-12-04T13:21:31.3741745Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3741781Z with policy(): 2025-12-04T13:21:31.3741932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3741973Z raise RuntimeError(msg) 2025-12-04T13:21:31.3742330Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3742332Z 2025-12-04T13:21:31.3742407Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3742643Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3742646Z 2025-12-04T13:21:31.3742733Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3742797Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3742861Z ======================= 1 failed, 4 deselected in 39.97s ======================= 2025-12-04T13:21:31.3742897Z Got exit code 1 2025-12-04T13:21:31.3742938Z Retrying single test... 
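After the consistent failure, the harness retries only the failing node id (the session below shows the stepcurrent plugin restricting the run to that single item). A rough local equivalent, assuming pytest is available and the repo root is the working directory, is to select the one node id and stop on the first failure:

    import pytest

    # Run exactly one test node id and stop on first failure; the CI retry
    # below additionally relies on the stepcurrent plugin to skip items
    # that already ran in this shard.
    exit_code = pytest.main([
        "test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda",
        "-x",
        "-v",
    ])
    print("pytest exit code:", exit_code)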
2025-12-04T13:21:31.3743127Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e9a6375f681e708.xml 2025-12-04T13:21:31.3743185Z ============================= test session starts ============================== 2025-12-04T13:21:31.3743298Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3743342Z cachedir: .pytest_cache 2025-12-04T13:21:31.3743500Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3743548Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3743588Z configfile: pytest.ini 2025-12-04T13:21:31.3743752Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3743827Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3744068Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3744113Z Running 1 items in this shard 2025-12-04T13:21:31.3744115Z 2025-12-04T13:21:31.3744429Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:03:39.615000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 534583 2025-12-04T13:21:31.3744586Z I1204 13:03:39.616000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 534584 2025-12-04T13:21:31.3744737Z I1204 13:03:39.617000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 534585 2025-12-04T13:21:31.3744889Z I1204 13:03:39.618000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 534586 2025-12-04T13:21:31.3745482Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3745538Z _warn_cpu_init() 2025-12-04T13:21:31.3746103Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3746139Z _warn_cpu_init() 2025-12-04T13:21:31.3746713Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3746751Z _warn_cpu_init() 2025-12-04T13:21:31.3747313Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3747350Z _warn_cpu_init() 2025-12-04T13:21:31.3747645Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3747690Z return func(*args, **kwargs) 2025-12-04T13:21:31.3747833Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3747997Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3748320Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3748488Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3748774Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3748900Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3749180Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3749328Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3749606Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3749766Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3750054Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3750203Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3750481Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3750631Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3751115Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3751234Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3751431Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3751798Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3751913Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3752125Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3752292Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3752330Z dist init r=1, world=4 2025-12-04T13:21:31.3752468Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3752626Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3752923Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3753080Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3753374Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3753509Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3753809Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3753991Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3754323Z [rank0]:E1204 13:04:17.319000 534583 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3754506Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3754807Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3754954Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3755261Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3755443Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3755940Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3756079Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3756292Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3756681Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3756829Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3757057Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3757243Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3757297Z dist init r=0, world=4 2025-12-04T13:21:31.3758017Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3758215Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3758544Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3758710Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3759023Z [rank3]:E1204 
13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3759170Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3759468Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3759689Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3759975Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3760143Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3760444Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3760586Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3760910Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3761071Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3761576Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:21:31.3761702Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3761916Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3762328Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3762452Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3762699Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3762875Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3762940Z dist init r=3, world=4 2025-12-04T13:21:31.3763101Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3763287Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3763600Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3763765Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3764086Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3764252Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3764559Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3764717Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3765024Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3765188Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3765486Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3765651Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3765939Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3766115Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3766608Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3766755Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3766979Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3767371Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3767507Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3767731Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3767926Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3767982Z dist init r=2, world=4 2025-12-04T13:21:31.3768382Z [rank0]:[W1204 13:04:17.179298369 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3768446Z FAILED [39.6407s] [100%] 2025-12-04T13:21:31.3768449Z 2025-12-04T13:21:31.3768517Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3768657Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3768754Z Traceback (most recent call last): 2025-12-04T13:21:31.3768939Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3769008Z self._join_processes(fn) 2025-12-04T13:21:31.3769204Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3769264Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3769488Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3769544Z raise RuntimeError(error) 2025-12-04T13:21:31.3769648Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3769706Z Traceback (most recent call last): 2025-12-04T13:21:31.3769893Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3769965Z getattr(self, test_name)() 2025-12-04T13:21:31.3770152Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3770198Z fn() 2025-12-04T13:21:31.3770372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3770436Z method(*args, **kwargs) 2025-12-04T13:21:31.3770623Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3770692Z method(*args, **kwargs) 2025-12-04T13:21:31.3770854Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3770913Z with policy(): 2025-12-04T13:21:31.3771085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3771156Z raise RuntimeError(msg) 2025-12-04T13:21:31.3771531Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3771533Z 2025-12-04T13:21:31.3771631Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3771882Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3771885Z 2025-12-04T13:21:31.3772015Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3772017Z 2025-12-04T13:21:31.3772105Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3772169Z Traceback (most recent call last): 2025-12-04T13:21:31.3772357Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3773643Z getattr(self, test_name)() 2025-12-04T13:21:31.3773832Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3773872Z fn() 2025-12-04T13:21:31.3774061Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3774113Z method(*args, **kwargs) 2025-12-04T13:21:31.3774291Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3774342Z method(*args, **kwargs) 2025-12-04T13:21:31.3774524Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3774595Z with policy(): 2025-12-04T13:21:31.3774791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3774842Z raise RuntimeError(msg) 2025-12-04T13:21:31.3775220Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3775222Z 2025-12-04T13:21:31.3775317Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3775577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3775579Z 2025-12-04T13:21:31.3775700Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3775704Z 2025-12-04T13:21:31.3775706Z 2025-12-04T13:21:31.3775791Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3775902Z Process 0 terminated with exit code 10, terminating remaining processes. 
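The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources") points at the documented shutdown step for torch.distributed programs. A minimal sketch of that lifecycle, assuming a standalone script launched with torchrun rather than the multiprocess test harness used in this log; the device_id= argument to init_process_group is available in recent PyTorch releases and also addresses the barrier() device-context warning that appears further down:

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        # torchrun sets RANK, WORLD_SIZE and LOCAL_RANK in the environment.
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)
        # Bind the default group to this rank's GPU up front.
        dist.init_process_group(backend="nccl", device_id=torch.device("cuda", local_rank))
        try:
            # ... training / test body ...
            dist.barrier()
        finally:
            # The step the warning says is missing: release NCCL resources explicitly.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()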
2025-12-04T13:21:31.3776146Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e9a6375f681e708.xml - 2025-12-04T13:21:31.3776239Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3776517Z FAILED [39.6407s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3776587Z Traceback (most recent call last): 2025-12-04T13:21:31.3776776Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3776832Z getattr(self, test_name)() 2025-12-04T13:21:31.3777022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3777077Z fn() 2025-12-04T13:21:31.3777251Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3777302Z method(*args, **kwargs) 2025-12-04T13:21:31.3777476Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3777521Z method(*args, **kwargs) 2025-12-04T13:21:31.3777729Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3777778Z with policy(): 2025-12-04T13:21:31.3777956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3778008Z raise RuntimeError(msg) 2025-12-04T13:21:31.3778659Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3778662Z 2025-12-04T13:21:31.3778785Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3779033Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3779035Z 2025-12-04T13:21:31.3779146Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3779148Z 2025-12-04T13:21:31.3779235Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3779317Z Traceback (most recent call last): 2025-12-04T13:21:31.3779516Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3779587Z getattr(self, test_name)() 2025-12-04T13:21:31.3779757Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3779820Z fn() 2025-12-04T13:21:31.3779981Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3780053Z method(*args, **kwargs) 2025-12-04T13:21:31.3780219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3780282Z method(*args, **kwargs) 2025-12-04T13:21:31.3780449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3780512Z with policy(): 2025-12-04T13:21:31.3780695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3780752Z raise RuntimeError(msg) 2025-12-04T13:21:31.3781129Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3781131Z 2025-12-04T13:21:31.3781219Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3781478Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3781481Z 2025-12-04T13:21:31.3781574Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3781679Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3781755Z ====================== 1 failed, 18 deselected in 39.81s ======================= 2025-12-04T13:21:31.3781821Z Got exit code 1 2025-12-04T13:21:31.3781872Z Retrying single test... 
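The RuntimeError above is raised by the CUDA memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it records caching-allocator and driver-level memory per device before the test body and compares again afterwards, which is exactly the two number pairs quoted in the message. The "To execute this test" lines give the repro command that re-runs just this test with the check on. A rough standalone sketch of the same comparison (check_leak is a hypothetical helper for illustration, not the internal policy object raised from common_utils.py; the in-tree checker also empties caches, retries, and inspects every visible device):

    import torch

    def check_leak(fn, device: int = 0) -> None:
        """Run fn() and complain if CUDA memory on `device` does not return to baseline."""
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator view
        free_before, total = torch.cuda.mem_get_info(device)    # driver view (free, total)
        fn()
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: "
                f"allocator {alloc_before} -> {alloc_after}, "
                f"driver allocated {total - free_before} -> {total - free_after}"
            )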
2025-12-04T13:21:31.3782085Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ab74cfc34851cb6b.xml 2025-12-04T13:21:31.3782165Z ============================= test session starts ============================== 2025-12-04T13:21:31.3782306Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3782388Z cachedir: .pytest_cache 2025-12-04T13:21:31.3782559Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3782626Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3782692Z configfile: pytest.ini 2025-12-04T13:21:31.3782891Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3782976Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3783232Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3783286Z Running 1 items in this shard 2025-12-04T13:21:31.3783289Z 2025-12-04T13:21:31.3783640Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:04:22.030000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 534985 2025-12-04T13:21:31.3783831Z I1204 13:04:22.030000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 534986 2025-12-04T13:21:31.3784029Z I1204 13:04:22.031000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 534987 2025-12-04T13:21:31.3784202Z I1204 13:04:22.032000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 534988 2025-12-04T13:21:31.3784795Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3784869Z _warn_cpu_init() 2025-12-04T13:21:31.3785457Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3785519Z _warn_cpu_init() 2025-12-04T13:21:31.3786111Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3786160Z _warn_cpu_init() 2025-12-04T13:21:31.3786760Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3786813Z _warn_cpu_init() 2025-12-04T13:21:31.3787129Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3787205Z return func(*args, **kwargs) 2025-12-04T13:21:31.3787364Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3787560Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3787868Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3788044Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3788386Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3788540Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3788839Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3789054Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3789360Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3789519Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3789821Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3789966Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3790286Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3790450Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3790963Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3791105Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3791307Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3791713Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3791843Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3792094Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3792283Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3792329Z dist init r=1, world=4 2025-12-04T13:21:31.3792512Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3792683Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3792993Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3793159Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3793473Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3793648Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3793958Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3794128Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3794416Z [rank2]:E1204 13:04:59.696000 534987 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3794585Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3794887Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3795054Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3795355Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3795515Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3796021Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3796159Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3796382Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3796765Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3796907Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3797138Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3797323Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3797391Z dist init r=2, world=4 2025-12-04T13:21:31.3797539Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3797725Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3798033Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3798264Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3798590Z [rank0]:E1204 
13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3798730Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3799030Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3799191Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3799495Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3799660Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3799966Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3800124Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3800414Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3800592Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3801091Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3801233Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3801451Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3801838Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3801984Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3802216Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3802404Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3802455Z dist init r=0, world=4 2025-12-04T13:21:31.3802618Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3802795Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3803147Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3805749Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3806045Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3806200Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3806482Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3806670Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3806956Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3807126Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3807432Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3807575Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3807892Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3808054Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3808614Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3808745Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3808960Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3809360Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3809486Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3809725Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3809915Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3809999Z dist init r=3, world=4 2025-12-04T13:21:31.3810357Z [rank0]:[W1204 13:04:59.550216122 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3810432Z FAILED [39.5386s] [100%] 2025-12-04T13:21:31.3810435Z 2025-12-04T13:21:31.3810515Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3810631Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3810696Z Traceback (most recent call last): 2025-12-04T13:21:31.3810884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3810960Z self._join_processes(fn) 2025-12-04T13:21:31.3811145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3811222Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3811410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3811486Z raise RuntimeError(error) 2025-12-04T13:21:31.3811588Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3811657Z Traceback (most recent call last): 2025-12-04T13:21:31.3811829Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3811893Z getattr(self, test_name)() 2025-12-04T13:21:31.3812058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3812137Z fn() 2025-12-04T13:21:31.3812300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3812364Z method(*args, **kwargs) 2025-12-04T13:21:31.3812526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3812584Z method(*args, **kwargs) 2025-12-04T13:21:31.3812781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3812829Z with policy(): 2025-12-04T13:21:31.3813004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3813067Z raise RuntimeError(msg) 2025-12-04T13:21:31.3813449Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3813453Z 2025-12-04T13:21:31.3813552Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3813818Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3813821Z 2025-12-04T13:21:31.3813919Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3813921Z 2025-12-04T13:21:31.3813937Z 2025-12-04T13:21:31.3814026Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3814137Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3814406Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ab74cfc34851cb6b.xml - 2025-12-04T13:21:31.3814516Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3814782Z FAILED [39.5386s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3814856Z Traceback (most recent call last): 2025-12-04T13:21:31.3815031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3815101Z getattr(self, test_name)() 2025-12-04T13:21:31.3815279Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3815335Z fn() 2025-12-04T13:21:31.3815506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3815572Z method(*args, **kwargs) 2025-12-04T13:21:31.3815730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3815810Z method(*args, **kwargs) 2025-12-04T13:21:31.3815972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3816036Z with policy(): 2025-12-04T13:21:31.3816211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3816257Z raise RuntimeError(msg) 2025-12-04T13:21:31.3816653Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3816656Z 2025-12-04T13:21:31.3816742Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3817010Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3817012Z 2025-12-04T13:21:31.3817109Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3817189Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3817273Z ====================== 1 failed, 18 deselected in 39.70s ======================= 2025-12-04T13:21:31.3817341Z Got exit code 1 2025-12-04T13:21:31.3817554Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3817708Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3817917Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1545c1c5fac9b58b.xml 2025-12-04T13:21:31.3817999Z ============================= test session starts ============================== 2025-12-04T13:21:31.3818180Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3818231Z cachedir: .pytest_cache 2025-12-04T13:21:31.3818477Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3818534Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3823610Z configfile: pytest.ini 2025-12-04T13:21:31.3823787Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3823864Z collecting ... collected 60 items / 5 deselected / 55 selected 2025-12-04T13:21:31.3823974Z stepcurrent: skipping 5 already run items. 2025-12-04T13:21:31.3824033Z Running 14 items in this shard 2025-12-04T13:21:31.3824036Z 2025-12-04T13:21:31.3824361Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:05:04.342000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 535387 2025-12-04T13:21:31.3824516Z I1204 13:05:04.342000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 535388 2025-12-04T13:21:31.3824669Z I1204 13:05:04.343000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 535389 2025-12-04T13:21:31.3824820Z I1204 13:05:04.343000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 535390 2025-12-04T13:21:31.3825404Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3825445Z _warn_cpu_init() 2025-12-04T13:21:31.3826014Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
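The UserWarning above recommends passing device_id so FSDP runs its sharding initialization on the GPU instead of on CPU, and notes that sync_module_states=True needs the module on a GPU device. A minimal sketch of the recommended construction, assuming the process group is already initialized and `model` is a hypothetical nn.Module built on CPU:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # `model` is assumed to be an ordinary nn.Module created on CPU.
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # move the module to this rank's GPU for sharding init
        sync_module_states=True,                # requires the module on GPU, per the warning
    )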
2025-12-04T13:21:31.3826052Z _warn_cpu_init() 2025-12-04T13:21:31.3826617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3826656Z _warn_cpu_init() 2025-12-04T13:21:31.3827243Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3827282Z _warn_cpu_init() 2025-12-04T13:21:31.3827577Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3827621Z return func(*args, **kwargs) 2025-12-04T13:21:31.3827767Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3827930Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3828279Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3828461Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3828773Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3828899Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3829176Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3829327Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3829604Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3829753Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3830031Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3830168Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3830446Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3830594Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3831090Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3831208Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3831403Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3831792Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3831907Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3832120Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3832286Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3832326Z dist init r=2, world=4 2025-12-04T13:21:31.3832464Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3832624Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3832932Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3833097Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3833381Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3833505Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3833783Z [rank1]:E1204 13:06:00.452000 535388 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3833930Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3834207Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3834354Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3834632Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3834767Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3835044Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3835195Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3835683Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3835807Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3836004Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3836376Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3836491Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3836703Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3836869Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3836916Z dist init r=1, world=4 2025-12-04T13:21:31.3837064Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3837234Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3837521Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3837674Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3837959Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3838085Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3838401Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3838548Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3838822Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3838973Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3839251Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3839389Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3839666Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3839813Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3840320Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3840434Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3840631Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3841001Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3841115Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3841339Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3841525Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3841564Z dist init r=3, world=4 2025-12-04T13:21:31.3841704Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3841865Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3842150Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3842306Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3842590Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3842714Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3842989Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3843137Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3843413Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3843561Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3843838Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3843975Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3844269Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3844419Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3844907Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3845022Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3845216Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3845594Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3845727Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3845937Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3846102Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3846140Z dist init r=0, world=4 2025-12-04T13:21:31.3846481Z [rank0]:[W1204 13:06:00.571404868 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3846523Z FAILED [58.0537s] [ 7%] 2025-12-04T13:21:31.3846526Z 2025-12-04T13:21:31.3846586Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3846698Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3846746Z Traceback (most recent call last): 2025-12-04T13:21:31.3846911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3846954Z self._join_processes(fn) 2025-12-04T13:21:31.3847127Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3847181Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3847361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3847404Z raise RuntimeError(error) 2025-12-04T13:21:31.3847489Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3847534Z Traceback (most recent call last): 2025-12-04T13:21:31.3847695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3847737Z getattr(self, test_name)() 2025-12-04T13:21:31.3847898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3847933Z fn() 2025-12-04T13:21:31.3848086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3848128Z method(*args, **kwargs) 2025-12-04T13:21:31.3848338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3848378Z method(*args, **kwargs) 2025-12-04T13:21:31.3848532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3848570Z with policy(): 2025-12-04T13:21:31.3848725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3848765Z raise RuntimeError(msg) 2025-12-04T13:21:31.3849131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:21:31.3849133Z 2025-12-04T13:21:31.3849209Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3849470Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3849486Z 2025-12-04T13:21:31.3849576Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3849593Z 2025-12-04T13:21:31.3849595Z 2025-12-04T13:21:31.3849670Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3849760Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3849994Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1545c1c5fac9b58b.xml - 2025-12-04T13:21:31.3850055Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3850316Z FAILED [58.0537s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3850367Z Traceback (most recent call last): 2025-12-04T13:21:31.3850533Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3850575Z getattr(self, test_name)() 2025-12-04T13:21:31.3850735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3850768Z fn() 2025-12-04T13:21:31.3850920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3850962Z method(*args, **kwargs) 2025-12-04T13:21:31.3851112Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3851152Z method(*args, **kwargs) 2025-12-04T13:21:31.3851302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3851341Z with policy(): 2025-12-04T13:21:31.3851493Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3851534Z raise RuntimeError(msg) 2025-12-04T13:21:31.3851897Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3851900Z 2025-12-04T13:21:31.3851973Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3852227Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3852229Z 2025-12-04T13:21:31.3852317Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3852381Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
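Editor's note: each run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the shutdown pattern that warning (and the later "barrier(): using the device under current context" warning) point to, assuming the usual MASTER_ADDR/MASTER_PORT rendezvous variables are set by the launcher; the run() helper and the bare barrier() stand in for the real test body, and the device_id argument requires a recent PyTorch release.

import torch
import torch.distributed as dist


def run(rank: int, world_size: int) -> None:
    device = torch.device(f"cuda:{rank}")
    # Binding the process group to a device up front also silences the
    # barrier() device warning seen later in this log.
    dist.init_process_group("nccl", rank=rank, world_size=world_size, device_id=device)
    try:
        dist.barrier()  # collectives for the actual test body would go here
    finally:
        # Explicit teardown avoids the "can leak resources" warning at exit.
        dist.destroy_process_group()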
2025-12-04T13:21:31.3852445Z ======================= 1 failed, 5 deselected in 58.19s =======================
2025-12-04T13:21:31.3852482Z Got exit code 1
2025-12-04T13:21:31.3852521Z Retrying single test...
2025-12-04T13:21:31.3852710Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-cd73215e76dd89cf.xml
2025-12-04T13:21:31.3852770Z ============================= test session starts ==============================
2025-12-04T13:21:31.3852881Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T13:21:31.3852923Z cachedir: .pytest_cache
2025-12-04T13:21:31.3853081Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:21:31.3853128Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:21:31.3853200Z configfile: pytest.ini
2025-12-04T13:21:31.3853366Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:21:31.3853451Z collecting ... collected 60 items / 18 deselected / 42 selected
2025-12-04T13:21:31.3853688Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda
2025-12-04T13:21:31.3853731Z Running 1 items in this shard
2025-12-04T13:21:31.3853733Z
2025-12-04T13:21:31.3854054Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:06:04.930000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 535789
2025-12-04T13:21:31.3854210Z I1204 13:06:04.930000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 535790
2025-12-04T13:21:31.3854363Z I1204 13:06:04.931000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 535791
2025-12-04T13:21:31.3854515Z I1204 13:06:04.932000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 535792
2025-12-04T13:21:31.3855097Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3855136Z _warn_cpu_init()
2025-12-04T13:21:31.3855708Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3855749Z _warn_cpu_init() 2025-12-04T13:21:31.3856323Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3856360Z _warn_cpu_init() 2025-12-04T13:21:31.3856924Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3856964Z _warn_cpu_init() 2025-12-04T13:21:31.3857255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3857298Z return func(*args, **kwargs) 2025-12-04T13:21:31.3857441Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3857613Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3857930Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3858087Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3858404Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3858530Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3858808Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3858959Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3859235Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3859380Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3859656Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3859794Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3860073Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3860220Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3860723Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3860841Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3861037Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3861408Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3861521Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3861734Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3861909Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3861971Z dist init r=1, world=4 2025-12-04T13:21:31.3862108Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3862268Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3862555Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3862710Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3862997Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3863123Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3863399Z [rank3]:E1204 13:07:00.985000 535792 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3863547Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3863824Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3863972Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3864247Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3864384Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3864661Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3864819Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3865310Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:21:31.3865427Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3865622Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3865992Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3866116Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3866336Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3866516Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3866554Z dist init r=3, world=4 2025-12-04T13:21:31.3866692Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3866851Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3867140Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3867297Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3867582Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3867706Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3867981Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3868129Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3868445Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3868593Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3868868Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3869003Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3869293Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3869443Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3869934Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3870049Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3870244Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3870628Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3870763Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3870973Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3871136Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3871174Z dist init r=2, world=4 2025-12-04T13:21:31.3871311Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3871472Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3871763Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3871917Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3872203Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3872327Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3872603Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3872752Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3873028Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3873175Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3873458Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3873596Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3873874Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3874022Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3874511Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3874636Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3874851Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3875230Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3875344Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3875554Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3875719Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3875757Z dist init r=0, world=4 2025-12-04T13:21:31.3876096Z [rank0]:[W1204 13:07:01.983265094 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3876136Z FAILED [58.0582s] [100%] 2025-12-04T13:21:31.3876140Z 2025-12-04T13:21:31.3876197Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3876309Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3876354Z Traceback (most recent call last): 2025-12-04T13:21:31.3876519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3876562Z self._join_processes(fn) 2025-12-04T13:21:31.3876738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3876793Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3876972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3877015Z raise RuntimeError(error) 2025-12-04T13:21:31.3877095Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3877140Z Traceback (most recent call last): 2025-12-04T13:21:31.3877302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3877344Z getattr(self, test_name)() 2025-12-04T13:21:31.3877513Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3877547Z fn() 2025-12-04T13:21:31.3877705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3877747Z method(*args, **kwargs) 2025-12-04T13:21:31.3877899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3877939Z method(*args, **kwargs) 2025-12-04T13:21:31.3878091Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3878128Z with policy(): 2025-12-04T13:21:31.3878346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3878387Z raise RuntimeError(msg) 2025-12-04T13:21:31.3878769Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3878805Z 2025-12-04T13:21:31.3878881Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3879126Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3879128Z 2025-12-04T13:21:31.3879217Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3879219Z 2025-12-04T13:21:31.3879221Z 2025-12-04T13:21:31.3879296Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3879386Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3879619Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-cd73215e76dd89cf.xml - 2025-12-04T13:21:31.3879682Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3879940Z FAILED [58.0582s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3879987Z Traceback (most recent call last): 2025-12-04T13:21:31.3880150Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3880192Z getattr(self, test_name)() 2025-12-04T13:21:31.3880352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3880387Z fn() 2025-12-04T13:21:31.3880541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3880582Z method(*args, **kwargs) 2025-12-04T13:21:31.3880737Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3880776Z method(*args, **kwargs) 2025-12-04T13:21:31.3880925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3880961Z with policy(): 2025-12-04T13:21:31.3881115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3881155Z raise RuntimeError(msg) 2025-12-04T13:21:31.3881533Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3881537Z 2025-12-04T13:21:31.3881611Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3881857Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3881859Z 2025-12-04T13:21:31.3881948Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3882009Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
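Editor's note: the retried sessions repeat the _init_utils.py UserWarning about FSDP's sharding initialization running on CPU. A minimal sketch of what that warning suggests, assuming one GPU per rank and an already-initialized process group; the wrap_model() helper and the toy Linear module are illustrative, not part of the test under investigation.

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def wrap_model(rank: int) -> FSDP:
    model = nn.Linear(8, 8)  # the module can stay on CPU here
    return FSDP(
        model,
        device_id=torch.device(f"cuda:{rank}"),  # moves sharding init onto the GPU
        sync_module_states=True,  # needs GPU communication, hence device_id above
    )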
2025-12-04T13:21:31.3882073Z ====================== 1 failed, 18 deselected in 58.19s =======================
2025-12-04T13:21:31.3882111Z Got exit code 1
2025-12-04T13:21:31.3882153Z Retrying single test...
2025-12-04T13:21:31.3882344Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80a41ceace54cce5.xml
2025-12-04T13:21:31.3882415Z ============================= test session starts ==============================
2025-12-04T13:21:31.3882537Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T13:21:31.3882591Z cachedir: .pytest_cache
2025-12-04T13:21:31.3882749Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:21:31.3882797Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:21:31.3882837Z configfile: pytest.ini
2025-12-04T13:21:31.3883001Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:21:31.3883076Z collecting ... collected 60 items / 18 deselected / 42 selected
2025-12-04T13:21:31.3883315Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda
2025-12-04T13:21:31.3883361Z Running 1 items in this shard
2025-12-04T13:21:31.3883364Z
2025-12-04T13:21:31.3883687Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:07:05.671000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 536191
2025-12-04T13:21:31.3883846Z I1204 13:07:05.672000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 536192
2025-12-04T13:21:31.3883998Z I1204 13:07:05.673000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 536193
2025-12-04T13:21:31.3884149Z I1204 13:07:05.673000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 536194
2025-12-04T13:21:31.3884730Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3884772Z _warn_cpu_init()
2025-12-04T13:21:31.3885340Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3885387Z _warn_cpu_init()
2025-12-04T13:21:31.3885684Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context.
You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3885729Z return func(*args, **kwargs) 2025-12-04T13:21:31.3886304Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3886341Z _warn_cpu_init() 2025-12-04T13:21:31.3886928Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3886988Z _warn_cpu_init() 2025-12-04T13:21:31.3887132Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3887295Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3887585Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3887741Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3888030Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3888195Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3888477Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3888627Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3888907Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3889055Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3889333Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3889471Z [rank1]:E1204 13:08:01.669000 536192 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3889753Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3889915Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3890408Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3890527Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3890725Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3891118Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3891248Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3891474Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3891641Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3891679Z dist init r=1, world=4 2025-12-04T13:21:31.3891820Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3891981Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3892271Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3892427Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3892714Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3892838Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3893120Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3893270Z [rank3]:E1204 
13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3893547Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3893695Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3893971Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3894119Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3894398Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3894550Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3895041Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3895157Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3895366Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3895748Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3895873Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3896083Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3896249Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3896289Z dist init r=3, world=4 2025-12-04T13:21:31.3896427Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3896589Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3896876Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.3897031Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3897316Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3897442Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3897722Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3897872Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3898194Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3898341Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3898633Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3898771Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3899049Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3899197Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3899701Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:21:31.3899833Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3900045Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3900418Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3900532Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3900746Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3900910Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3900951Z dist init r=2, world=4 2025-12-04T13:21:31.3901087Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3901248Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3901535Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3901689Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3901975Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3902101Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3902381Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3902529Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3902817Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3902967Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3903243Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3903380Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3903656Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3903807Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3904308Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3904444Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3904642Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3905016Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3905131Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3905342Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3905508Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3905546Z dist init r=0, world=4 2025-12-04T13:21:31.3905886Z [rank0]:[W1204 13:08:01.633961664 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3905927Z FAILED [57.8521s] [100%] 2025-12-04T13:21:31.3905929Z 2025-12-04T13:21:31.3905986Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3906099Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3906147Z Traceback (most recent call last): 2025-12-04T13:21:31.3906311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3906355Z self._join_processes(fn) 2025-12-04T13:21:31.3906529Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3906582Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3906761Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3906821Z raise RuntimeError(error) 2025-12-04T13:21:31.3906903Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3906950Z Traceback (most recent call last): 2025-12-04T13:21:31.3907114Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3907157Z getattr(self, test_name)() 2025-12-04T13:21:31.3907317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3907351Z fn() 2025-12-04T13:21:31.3907504Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3907545Z method(*args, **kwargs) 2025-12-04T13:21:31.3907697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3907738Z method(*args, **kwargs) 2025-12-04T13:21:31.3907900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3907947Z with policy(): 2025-12-04T13:21:31.3908101Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3908197Z raise RuntimeError(msg) 2025-12-04T13:21:31.3908562Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3908564Z 2025-12-04T13:21:31.3908641Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3908886Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3908888Z 2025-12-04T13:21:31.3908979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3908982Z 2025-12-04T13:21:31.3908984Z 2025-12-04T13:21:31.3909059Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3909149Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3909385Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80a41ceace54cce5.xml - 2025-12-04T13:21:31.3909448Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3909711Z FAILED [57.8521s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3909757Z Traceback (most recent call last): 2025-12-04T13:21:31.3909922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3909965Z getattr(self, test_name)() 2025-12-04T13:21:31.3910128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3910162Z fn() 2025-12-04T13:21:31.3910314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3910354Z method(*args, **kwargs) 2025-12-04T13:21:31.3910506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3910545Z method(*args, **kwargs) 2025-12-04T13:21:31.3910709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3910747Z with policy(): 2025-12-04T13:21:31.3910901Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3910956Z raise RuntimeError(msg) 2025-12-04T13:21:31.3911364Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3911366Z 2025-12-04T13:21:31.3911442Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3911688Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3911690Z 2025-12-04T13:21:31.3911780Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3911858Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
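The failure above comes from the CUDA memory leak check enabled via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: the harness snapshots per-device memory before the test body and re-measures it on exit, failing the test when the caching-allocator count has grown and the CUDA driver API confirms that driver-level allocation grew as well (here 512 -> 12800 bytes allocator-side and roughly 2.3 GB -> 3.8 GB driver-side on each rank). Below is a rough, self-contained sketch of that before/after comparison using only public torch.cuda calls; it illustrates the idea and is not the actual implementation in torch/testing/_internal/common_utils.py.

import torch

def run_with_leak_check(device: int, test_fn) -> None:
    # Snapshot caching-allocator usage and driver-level usage before the test.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before

    test_fn()

    # Re-measure after the test; growth in both counters suggests a leak.
    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )

The repro command printed with the failure (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda) re-runs just this test with the same check enabled; PYTORCH_PRINT_REPRO_ON_FAILURE=0 only suppresses that repro message.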
2025-12-04T13:21:31.3911937Z ====================== 1 failed, 18 deselected in 57.99s ======================= 2025-12-04T13:21:31.3911989Z Got exit code 1 2025-12-04T13:21:31.3912183Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3912311Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3912502Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09c520c1ae6de888.xml 2025-12-04T13:21:31.3912560Z ============================= test session starts ============================== 2025-12-04T13:21:31.3912674Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3912715Z cachedir: .pytest_cache 2025-12-04T13:21:31.3912876Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3912924Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3912966Z configfile: pytest.ini 2025-12-04T13:21:31.3913128Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3913204Z collecting ... collected 60 items / 6 deselected / 54 selected 2025-12-04T13:21:31.3913257Z stepcurrent: skipping 6 already run items. 2025-12-04T13:21:31.3913301Z Running 13 items in this shard 2025-12-04T13:21:31.3913303Z 2025-12-04T13:21:31.3913618Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:08:05.974000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 536593 2025-12-04T13:21:31.3913774Z I1204 13:08:05.975000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 536594 2025-12-04T13:21:31.3913928Z I1204 13:08:05.975000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 536595 2025-12-04T13:21:31.3914080Z I1204 13:08:05.976000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 536596 2025-12-04T13:21:31.3914661Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3914708Z _warn_cpu_init() 2025-12-04T13:21:31.3915278Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3915319Z _warn_cpu_init() 2025-12-04T13:21:31.3915883Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3915922Z _warn_cpu_init() 2025-12-04T13:21:31.3916497Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3916554Z _warn_cpu_init() 2025-12-04T13:21:31.3917051Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3917117Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3917608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3917670Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3918198Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3918258Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3918749Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3918810Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3919104Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.3919190Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3919694Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3919753Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3920046Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3920126Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3920416Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3920511Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3920808Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3920907Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3921404Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3921464Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3921950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3922011Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3922297Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3922377Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3922665Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.3922744Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3923032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3923107Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3923392Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3923465Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3924775Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3924906Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3925148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3925211Z return func(*args, **kwargs) 2025-12-04T13:21:31.3926491Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.3926616Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3927884Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3928008Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3928284Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3928328Z return func(*args, **kwargs) 2025-12-04T13:21:31.3928565Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3928607Z return func(*args, **kwargs) 2025-12-04T13:21:31.3929881Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3930016Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3930265Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3930307Z return func(*args, **kwargs) 2025-12-04T13:21:31.3930528Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.3930569Z return func(*args, **kwargs) 2025-12-04T13:21:31.3930789Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.3930830Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931050Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.3931093Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.3931355Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3931687Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931833Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3931998Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3932291Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3932448Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3932735Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3932861Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3933152Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3933303Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3933587Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3933737Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3934014Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3934164Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3934450Z 
[rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3934611Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3935094Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.3935211Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3935409Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3935776Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3935893Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3936106Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3936273Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3936312Z dist init r=0, world=4 2025-12-04T13:21:31.3936453Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3936614Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3936902Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3937058Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3937351Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3937479Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3937755Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3937904Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.3938224Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3938376Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3938667Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3938833Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3939111Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3939259Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3939742Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:21:31.3939858Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3940054Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3940416Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3940532Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3940747Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3940912Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3940951Z dist init r=1, world=4 2025-12-04T13:21:31.3941088Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3941249Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3941550Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3941706Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3941991Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3942116Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3942392Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3942540Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3942825Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3942992Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3943268Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3943405Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3943682Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3943832Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3944312Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.3944428Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3944623Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3944985Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3945101Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3945313Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3945478Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3945516Z dist init r=2, world=4 2025-12-04T13:21:31.3945655Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3945823Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3946111Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3946266Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3946551Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3946675Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3946951Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3947119Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3947403Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3947553Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3947830Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3947969Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3948295Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3948446Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3948924Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.3949038Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3949236Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3949597Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3949711Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3949924Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3950101Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3950141Z dist init r=3, world=4 2025-12-04T13:21:31.3950477Z [rank0]:[W1204 13:08:14.396009769 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3950808Z [rank1]:[W1204 13:08:14.442966570 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3951133Z [rank2]:[W1204 13:08:14.456009417 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3951474Z [rank3]:[W1204 13:08:14.501052750 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3951545Z FAILED [22.9246s] [ 7%] 2025-12-04T13:21:31.3951547Z 2025-12-04T13:21:31.3951605Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3951707Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:21:31.3951753Z Traceback (most recent call last): 2025-12-04T13:21:31.3951919Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3951962Z self._join_processes(fn) 2025-12-04T13:21:31.3952137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3952191Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3952373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3952418Z raise RuntimeError(error) 2025-12-04T13:21:31.3952501Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3952546Z Traceback (most recent call last): 2025-12-04T13:21:31.3952710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3952752Z getattr(self, test_name)() 2025-12-04T13:21:31.3952910Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3952945Z fn() 2025-12-04T13:21:31.3953100Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3953140Z method(*args, **kwargs) 2025-12-04T13:21:31.3953293Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3953336Z method(*args, **kwargs) 2025-12-04T13:21:31.3953485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3953522Z with policy(): 2025-12-04T13:21:31.3953673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3953714Z raise RuntimeError(msg) 2025-12-04T13:21:31.3954079Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 
2025-12-04T13:21:31.3954081Z 2025-12-04T13:21:31.3954158Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3954395Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3954399Z 2025-12-04T13:21:31.3954487Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3954489Z 2025-12-04T13:21:31.3954491Z 2025-12-04T13:21:31.3954567Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3954654Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3954889Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09c520c1ae6de888.xml - 2025-12-04T13:21:31.3954949Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3955217Z FAILED [22.9246s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3955284Z Traceback (most recent call last): 2025-12-04T13:21:31.3955449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3955491Z getattr(self, test_name)() 2025-12-04T13:21:31.3955652Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3955686Z fn() 2025-12-04T13:21:31.3955839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3955878Z method(*args, **kwargs) 2025-12-04T13:21:31.3956031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3956070Z method(*args, **kwargs) 2025-12-04T13:21:31.3956222Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3956260Z with policy(): 2025-12-04T13:21:31.3956413Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3956453Z raise RuntimeError(msg) 2025-12-04T13:21:31.3956814Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.3956816Z 2025-12-04T13:21:31.3956893Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3957129Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3957132Z 2025-12-04T13:21:31.3957220Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3957284Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
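Besides the leak-check failure itself, this run repeatedly surfaces two warnings that point at the same per-rank setup and teardown pattern: the _init_utils.py:571 UserWarning about passing `device_id` "cuda" without an explicit index (which recommends torch.cuda.set_device() or an explicit index), and the ProcessGroupNCCL.cpp:1553 warning that destroy_process_group() was not called before exit. The following is a minimal per-rank sketch under the assumption of a single node with one GPU per rank; the Linear module, master address, and port are hypothetical stand-ins, not the test suite's actual model or rendezvous settings.

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def run(rank: int, world_size: int) -> None:
    # Hypothetical single-node rendezvous settings.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    # Bind this process to its GPU before any FSDP/NCCL work so that "cuda"
    # without an index resolves to the intended device (the fix suggested by
    # the _init_utils.py:571 warning above).
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        model = torch.nn.Linear(8, 8)             # hypothetical stand-in model
        fsdp_model = FSDP(model, device_id=rank)  # explicit device index
        x = torch.randn(2, 8, device=torch.device("cuda", rank))
        fsdp_model(x).sum().backward()
    finally:
        # Explicit teardown; skipping this is what triggers the
        # ProcessGroupNCCL.cpp:1553 warning about destroy_process_group().
        dist.destroy_process_group()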
2025-12-04T13:21:31.3957346Z ======================= 1 failed, 6 deselected in 23.06s ======================= 2025-12-04T13:21:31.3957383Z Got exit code 1 2025-12-04T13:21:31.3957424Z Retrying single test... 2025-12-04T13:21:31.3957611Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2fb1a3772346bf41.xml 2025-12-04T13:21:31.3957669Z ============================= test session starts ============================== 2025-12-04T13:21:31.3957792Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3957835Z cachedir: .pytest_cache 2025-12-04T13:21:31.3957994Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3958044Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3958083Z configfile: pytest.ini 2025-12-04T13:21:31.3958289Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3958365Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3958591Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3958636Z Running 1 items in this shard 2025-12-04T13:21:31.3958638Z 2025-12-04T13:21:31.3958946Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:08:31.443000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 537859 2025-12-04T13:21:31.3959135Z I1204 13:08:31.444000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 537860 2025-12-04T13:21:31.3959299Z I1204 13:08:31.445000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 537861 2025-12-04T13:21:31.3959450Z I1204 13:08:31.445000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 537862 2025-12-04T13:21:31.3960032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3960071Z _warn_cpu_init() 2025-12-04T13:21:31.3960568Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3960630Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3961200Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3961237Z _warn_cpu_init() 2025-12-04T13:21:31.3961733Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3961795Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3962372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3962412Z _warn_cpu_init() 2025-12-04T13:21:31.3962973Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3963011Z _warn_cpu_init() 2025-12-04T13:21:31.3963304Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3963388Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3963892Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3963969Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3964260Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.3964338Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3964625Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3964708Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3965199Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3965259Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3965745Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3965805Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3966290Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3966347Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3966648Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3966724Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3967011Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3967092Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3967375Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3967454Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3967953Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3968029Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3968361Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3968437Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3968724Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3968800Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3970083Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3970208Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3970439Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3970484Z return func(*args, **kwargs) 2025-12-04T13:21:31.3971777Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.3971902Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3972128Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3972170Z return func(*args, **kwargs) 2025-12-04T13:21:31.3973454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3973598Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3973825Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3973869Z return func(*args, **kwargs) 2025-12-04T13:21:31.3975133Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3975256Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3975482Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3975522Z return func(*args, **kwargs) 2025-12-04T13:21:31.3975745Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.3975785Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976018Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3976058Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976279Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3976320Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976538Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3976577Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976869Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3976909Z return func(*args, **kwargs) 2025-12-04T13:21:31.3977055Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3977228Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3977538Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3977695Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3977980Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3978109Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3978502Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3978653Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3978928Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3979076Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3979352Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3979490Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3979767Z 
[rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3979915Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3980408Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.3980528Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3980727Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3981091Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3981205Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3981418Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3981594Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3981659Z dist init r=3, world=4 2025-12-04T13:21:31.3981796Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3981956Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3982243Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3982398Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3982683Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3982810Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3983089Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3983235Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.3983510Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3983660Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3983936Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3984074Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3984349Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3984507Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3984988Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:21:31.3985106Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3985304Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3985665Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3985789Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3986009Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3986189Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3986228Z dist init r=1, world=4 2025-12-04T13:21:31.3986366Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3986525Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3986813Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3986970Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3987255Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3987379Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3987657Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3987806Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3988082Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3988276Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3988552Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3988686Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3988977Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3989127Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3989608Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.3989722Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3989918Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3990291Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3990428Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3990639Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3990802Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3990842Z dist init r=2, world=4 2025-12-04T13:21:31.3990981Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3991142Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3991428Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3991583Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3991868Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3991992Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3992271Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3992420Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3992740Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3992917Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3993207Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3993346Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3993623Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3993772Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3994251Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.3994377Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3994584Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3994956Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3995070Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3995280Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3995444Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3995483Z dist init r=0, world=4 2025-12-04T13:21:31.3995821Z [rank3]:[W1204 13:08:40.128034835 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996151Z [rank1]:[W1204 13:08:40.362097494 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996478Z [rank2]:[W1204 13:08:40.467382685 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996805Z [rank0]:[W1204 13:08:40.503820870 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996847Z FAILED [23.0261s] [100%] 2025-12-04T13:21:31.3996849Z 2025-12-04T13:21:31.3996906Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3997007Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:21:31.3997054Z Traceback (most recent call last): 2025-12-04T13:21:31.3997218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3997263Z self._join_processes(fn) 2025-12-04T13:21:31.3997445Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3997503Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3997682Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3997728Z raise RuntimeError(error) 2025-12-04T13:21:31.3997808Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3997854Z Traceback (most recent call last): 2025-12-04T13:21:31.3998014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3998056Z getattr(self, test_name)() 2025-12-04T13:21:31.3998249Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3998283Z fn() 2025-12-04T13:21:31.3998438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3998505Z method(*args, **kwargs) 2025-12-04T13:21:31.3998657Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3998709Z method(*args, **kwargs) 2025-12-04T13:21:31.3998860Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3998896Z with policy(): 2025-12-04T13:21:31.3999049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3999089Z raise RuntimeError(msg) 2025-12-04T13:21:31.3999449Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 
2025-12-04T13:21:31.3999451Z 2025-12-04T13:21:31.3999528Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3999764Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3999767Z 2025-12-04T13:21:31.3999855Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3999858Z 2025-12-04T13:21:31.3999860Z 2025-12-04T13:21:31.3999934Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4000022Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4000257Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2fb1a3772346bf41.xml - 2025-12-04T13:21:31.4000317Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4000570Z FAILED [23.0261s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4000619Z Traceback (most recent call last): 2025-12-04T13:21:31.4000782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4000825Z getattr(self, test_name)() 2025-12-04T13:21:31.4000984Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4001019Z fn() 2025-12-04T13:21:31.4001170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4001222Z method(*args, **kwargs) 2025-12-04T13:21:31.4001374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4001415Z method(*args, **kwargs) 2025-12-04T13:21:31.4001567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4001605Z with policy(): 2025-12-04T13:21:31.4001757Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4001798Z raise RuntimeError(msg) 2025-12-04T13:21:31.4002153Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4002156Z 2025-12-04T13:21:31.4002231Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4002477Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4002509Z 2025-12-04T13:21:31.4002596Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4002660Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
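[Editor's note] The retry above repeats the same cluster of warnings on every rank: FSDP receives a bare `device_id` of "cuda" without an index, the wrapped module is still on CPU during sharding init, barrier() has to guess the device, and destroy_process_group() is never called before exit. The sketch below shows the per-rank setup those warnings ask for; it is illustrative only (the helper names setup_rank/teardown are not part of the test harness, MASTER_ADDR/MASTER_PORT are assumed to come from the launcher, and the plain FSDP(model, device_id=rank) call stands in for the test's real wrapping policy).

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def setup_rank(rank: int, world_size: int, model: torch.nn.Module):
        # Bind this process to one GPU before any collective or FSDP init,
        # which is what the "does not have an explicit index" warning asks for.
        torch.cuda.set_device(rank)
        # Passing device_id here also addresses the barrier() warning about
        # "using the device under current context".
        dist.init_process_group(
            "nccl", rank=rank, world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        # An explicit index (not just "cuda") lets FSDP run sharding init on the
        # GPU instead of the slower CPU path the first warning complains about.
        return FSDP(model, device_id=rank)

    def teardown():
        # Addresses the ProcessGroupNCCL warning that destroy_process_group()
        # was not called before program exit.
        dist.destroy_process_group()

The FutureWarnings about NO_SHARD separately suggest moving those configurations to DistributedDataParallel rather than keeping them on FSDP.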
2025-12-04T13:21:31.4002722Z ====================== 1 failed, 18 deselected in 23.16s ======================= 2025-12-04T13:21:31.4002760Z Got exit code 1 2025-12-04T13:21:31.4002800Z Retrying single test... 2025-12-04T13:21:31.4002991Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e65debb59102de6.xml 2025-12-04T13:21:31.4003049Z ============================= test session starts ============================== 2025-12-04T13:21:31.4003162Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4003204Z cachedir: .pytest_cache 2025-12-04T13:21:31.4003364Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4003411Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4003452Z configfile: pytest.ini 2025-12-04T13:21:31.4003614Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4003690Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4003917Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4003961Z Running 1 items in this shard 2025-12-04T13:21:31.4003963Z 2025-12-04T13:21:31.4004272Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:08:57.247000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 539125 2025-12-04T13:21:31.4004429Z I1204 13:08:57.248000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 539126 2025-12-04T13:21:31.4004581Z I1204 13:08:57.248000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 539127 2025-12-04T13:21:31.4004730Z I1204 13:08:57.249000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 539128 2025-12-04T13:21:31.4005330Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4005369Z _warn_cpu_init() 2025-12-04T13:21:31.4005862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4005925Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4006509Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4006567Z _warn_cpu_init() 2025-12-04T13:21:31.4007061Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4007120Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4007692Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4007729Z _warn_cpu_init() 2025-12-04T13:21:31.4008400Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4008459Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4009028Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4009069Z _warn_cpu_init() 2025-12-04T13:21:31.4009363Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4009450Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4009736Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4009833Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4010329Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4010389Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4010679Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4010756Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4011044Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4011149Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4011448Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4011527Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4012022Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4012081Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4012566Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4012626Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4012913Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4012988Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4013275Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4013356Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4013849Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4013906Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4014207Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4014280Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4015563Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4015702Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4015951Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4015996Z return func(*args, **kwargs) 2025-12-04T13:21:31.4017267Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4017391Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4017620Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4017663Z return func(*args, **kwargs) 2025-12-04T13:21:31.4018965Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4019102Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4019331Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4019374Z return func(*args, **kwargs) 2025-12-04T13:21:31.4020654Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4020800Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4021024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4021066Z return func(*args, **kwargs) 2025-12-04T13:21:31.4021287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.4021327Z return func(*args, **kwargs) 2025-12-04T13:21:31.4021549Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4021591Z return func(*args, **kwargs) 2025-12-04T13:21:31.4021812Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4021853Z return func(*args, **kwargs) 2025-12-04T13:21:31.4022071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4022112Z return func(*args, **kwargs) 2025-12-04T13:21:31.4022404Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4022445Z return func(*args, **kwargs) 2025-12-04T13:21:31.4022592Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4022756Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4023048Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4023204Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4023498Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4023625Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4023903Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4024054Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4024331Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4024481Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4024765Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4024921Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4025198Z 
[rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4025348Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4025832Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.4025949Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4026147Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4026513Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4026628Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4026841Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4027008Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4027048Z dist init r=0, world=4 2025-12-04T13:21:31.4027188Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4027348Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4027635Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4027804Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4028089Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4028253Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4028529Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4028679Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.4028959Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4029134Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4029422Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4029558Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4029835Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4029985Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4030468Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4030585Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4030779Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4031144Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4031260Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4031474Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4031638Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4031776Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4031933Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4032234Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4032390Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4032673Z 
[rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4032797Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4033073Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4033233Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4033519Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4033677Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4033952Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4034088Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4034365Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4034514Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4034994Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.4035109Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4035305Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4035667Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4035782Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4035994Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4036157Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4036197Z dist init r=3, world=4 2025-12-04T13:21:31.4036250Z dist init r=2, world=4 2025-12-04T13:21:31.4036389Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4036549Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4036836Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4036990Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4037274Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4037400Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4037683Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4037853Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4038131Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4038316Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4038592Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4038728Z [rank1]:E1204 13:09:05.865000 539126 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4039009Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4039156Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4039636Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:21:31.4039751Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4039947Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4040307Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4040420Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4040644Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4040809Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4040849Z dist init r=1, world=4 2025-12-04T13:21:31.4041186Z [rank0]:[W1204 13:09:06.669658255 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4041516Z [rank3]:[W1204 13:09:06.670956054 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4041856Z [rank2]:[W1204 13:09:06.671696362 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4042194Z [rank1]:[W1204 13:09:06.871387068 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4042250Z FAILED [22.8248s] [100%] 2025-12-04T13:21:31.4042252Z 2025-12-04T13:21:31.4042309Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4042411Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:21:31.4042456Z Traceback (most recent call last): 2025-12-04T13:21:31.4042622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4042665Z self._join_processes(fn) 2025-12-04T13:21:31.4042843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4042899Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4043078Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4043122Z raise RuntimeError(error) 2025-12-04T13:21:31.4043204Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4043250Z Traceback (most recent call last): 2025-12-04T13:21:31.4043411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4043454Z getattr(self, test_name)() 2025-12-04T13:21:31.4043613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4043647Z fn() 2025-12-04T13:21:31.4043800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4043843Z method(*args, **kwargs) 2025-12-04T13:21:31.4043994Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4044034Z method(*args, **kwargs) 2025-12-04T13:21:31.4044183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4044221Z with policy(): 2025-12-04T13:21:31.4044371Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4044412Z raise RuntimeError(msg) 2025-12-04T13:21:31.4044781Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.4044785Z 2025-12-04T13:21:31.4044864Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4045101Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4045105Z 2025-12-04T13:21:31.4045192Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4045195Z 2025-12-04T13:21:31.4045255Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4045300Z Traceback (most recent call last): 2025-12-04T13:21:31.4045465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4045507Z getattr(self, test_name)() 2025-12-04T13:21:31.4045679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4045732Z fn() 2025-12-04T13:21:31.4045884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4045923Z method(*args, **kwargs) 2025-12-04T13:21:31.4046073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4046113Z method(*args, **kwargs) 2025-12-04T13:21:31.4046263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4046299Z with policy(): 2025-12-04T13:21:31.4046452Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4046492Z raise RuntimeError(msg) 2025-12-04T13:21:31.4046848Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4046852Z 2025-12-04T13:21:31.4046925Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4047157Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4047159Z 2025-12-04T13:21:31.4047247Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4047249Z 2025-12-04T13:21:31.4047251Z 2025-12-04T13:21:31.4047328Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4047418Z Process 2 terminated with exit code 10, terminating remaining processes. 
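The failure above is not an assertion failure but the CUDA memory-leak check this shard runs with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: the test wrapper compares caching-allocator usage on each device before and after the test body and raises if it grew (512 bytes reported before, 215552 after in the message above). A minimal sketch of that before/after comparison, using a hypothetical run_with_leak_check helper rather than the actual torch.testing._internal wrapper:

import torch

def run_with_leak_check(test_fn, device=0):
    # Snapshot caching-allocator usage on the target device before the test body runs.
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    test_fn()
    # Release cached blocks, then compare against the starting point.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: "
            f"allocated memory went from {before} to {after} bytes"
        )

The quoted repro line (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda) re-enables the real check when reproducing locally.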
2025-12-04T13:21:31.4047655Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e65debb59102de6.xml - 2025-12-04T13:21:31.4047719Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4047969Z FAILED [22.8248s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4048016Z Traceback (most recent call last): 2025-12-04T13:21:31.4048213Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4048257Z getattr(self, test_name)() 2025-12-04T13:21:31.4048438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4048474Z fn() 2025-12-04T13:21:31.4048627Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4048667Z method(*args, **kwargs) 2025-12-04T13:21:31.4048819Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4048857Z method(*args, **kwargs) 2025-12-04T13:21:31.4049008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4049044Z with policy(): 2025-12-04T13:21:31.4049195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4049235Z raise RuntimeError(msg) 2025-12-04T13:21:31.4049604Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.4049635Z 2025-12-04T13:21:31.4049709Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4049943Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4049945Z 2025-12-04T13:21:31.4050032Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4050036Z 2025-12-04T13:21:31.4050094Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4050138Z Traceback (most recent call last): 2025-12-04T13:21:31.4050301Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4050343Z getattr(self, test_name)() 2025-12-04T13:21:31.4050503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4050539Z fn() 2025-12-04T13:21:31.4050689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4050732Z method(*args, **kwargs) 2025-12-04T13:21:31.4050881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4050921Z method(*args, **kwargs) 2025-12-04T13:21:31.4051070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4051107Z with policy(): 2025-12-04T13:21:31.4051258Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4051300Z raise RuntimeError(msg) 2025-12-04T13:21:31.4051652Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4051656Z 2025-12-04T13:21:31.4051730Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4051962Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4051965Z 2025-12-04T13:21:31.4052052Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4052117Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
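Beyond the leak itself, the per-rank output above repeats two process-group hygiene warnings: barrier() warned that it picked the device from the current context because init_process_group() was not given a device_id, and ProcessGroupNCCL warned that destroy_process_group() was never called before the worker processes exited. A minimal sketch of the setup/teardown those warnings suggest, assuming a hypothetical worker(rank, world_size) entry point and that the rendezvous (MASTER_ADDR/MASTER_PORT) is already configured by the harness:

import torch
import torch.distributed as dist

def worker(rank: int, world_size: int) -> None:
    # Bind this process to one GPU and tell the process group which device it owns;
    # passing device_id addresses the barrier() "device under current context" warning.
    torch.cuda.set_device(rank)
    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )
    try:
        dist.barrier()
        # ... test or training body ...
    finally:
        # Explicit teardown avoids the ProcessGroupNCCL destroy_process_group() warning.
        dist.destroy_process_group()

This mirrors the remedies embedded in the warnings themselves (specify device_id in init_process_group, call destroy_process_group before exit); it is not the harness's own spawn code.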
2025-12-04T13:21:31.4052190Z ====================== 1 failed, 18 deselected in 22.96s ======================= 2025-12-04T13:21:31.4052229Z Got exit code 1 2025-12-04T13:21:31.4052411Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4052542Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4052730Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d0307ff0aa7f20f8.xml 2025-12-04T13:21:31.4052789Z ============================= test session starts ============================== 2025-12-04T13:21:31.4052900Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4052942Z cachedir: .pytest_cache 2025-12-04T13:21:31.4053101Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4053148Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4053188Z configfile: pytest.ini 2025-12-04T13:21:31.4053361Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4053454Z collecting ... collected 60 items / 7 deselected / 53 selected 2025-12-04T13:21:31.4053508Z stepcurrent: skipping 7 already run items. 2025-12-04T13:21:31.4053549Z Running 12 items in this shard 2025-12-04T13:21:31.4053551Z 2025-12-04T13:21:31.4053858Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 13:09:22.599000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 540391 2025-12-04T13:21:31.4054015Z I1204 13:09:22.600000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 540392 2025-12-04T13:21:31.4054169Z I1204 13:09:22.600000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 540393 2025-12-04T13:21:31.4054323Z I1204 13:09:22.601000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 540394 2025-12-04T13:21:31.4054909Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4054949Z _warn_cpu_init() 2025-12-04T13:21:31.4055250Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4055289Z _init_core_state( 2025-12-04T13:21:31.4055783Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4055847Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4056429Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4056466Z _warn_cpu_init() 2025-12-04T13:21:31.4056766Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4056804Z _init_core_state( 2025-12-04T13:21:31.4057296Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4057357Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4057936Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4057994Z _warn_cpu_init() 2025-12-04T13:21:31.4058337Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4058375Z _init_core_state( 2025-12-04T13:21:31.4058865Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4058926Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4059499Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4059535Z _warn_cpu_init() 2025-12-04T13:21:31.4060025Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4060083Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4060571Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4060628Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4060937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4060976Z _init_core_state( 2025-12-04T13:21:31.4061462Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4061522Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4062011Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4062068Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4063363Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4063512Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4064788Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4064912Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4066186Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4066309Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4067587Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4067734Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4067963Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068006Z return func(*args, **kwargs) 2025-12-04T13:21:31.4068268Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068309Z return func(*args, **kwargs) 2025-12-04T13:21:31.4068534Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068575Z return func(*args, **kwargs) 2025-12-04T13:21:31.4068798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068838Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069103Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069325Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069367Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069586Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069628Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069846Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069886Z return func(*args, **kwargs) 2025-12-04T13:21:31.4070188Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.4070229Z return func(*args, **kwargs) 2025-12-04T13:21:31.4070374Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4070540Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4070831Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4070988Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4071275Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4071412Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4071706Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4071869Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4072144Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4072292Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4072569Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4072707Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4072985Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4073134Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4073613Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4073731Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4073931Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4074291Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4074406Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4074628Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4074795Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4074835Z dist init r=0, world=4 2025-12-04T13:21:31.4074974Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4075133Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4075422Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4075576Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4078127Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4078496Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4078778Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4078929Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4079207Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4079356Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4079632Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4079771Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4080048Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4080199Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4080680Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 2025-12-04T13:21:31.4080798Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4080997Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4081370Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4081486Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4081701Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4081865Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4081905Z dist init r=1, world=4 2025-12-04T13:21:31.4082044Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4082204Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4082509Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4082687Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4082970Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4083096Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4083376Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4083524Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4083799Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4083946Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4084221Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4084356Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4084635Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4084784Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4085263Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4085378Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4085584Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4085942Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4086056Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4086268Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4086432Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4086471Z dist init r=3, world=4 2025-12-04T13:21:31.4086609Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4086777Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4087082Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4087235Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4087519Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4087643Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4087924Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4088074Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4088383Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4088529Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4088805Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4088943Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4089220Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4089367Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4089856Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T13:21:31.4089970Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4090168Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4090524Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4090637Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4090849Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4091026Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4091096Z dist init r=2, world=4 2025-12-04T13:21:31.4091433Z [rank0]:[W1204 13:09:31.066178555 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4091761Z [rank1]:[W1204 13:09:31.077287465 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4092087Z [rank2]:[W1204 13:09:31.199268541 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4092415Z [rank3]:[W1204 13:09:31.211099768 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4092455Z FAILED [22.8268s] [ 8%] 2025-12-04T13:21:31.4092459Z 2025-12-04T13:21:31.4092518Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4092620Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T13:21:31.4092666Z Traceback (most recent call last): 2025-12-04T13:21:31.4092833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4092877Z self._join_processes(fn) 2025-12-04T13:21:31.4093052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4093107Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4093287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4093330Z raise RuntimeError(error) 2025-12-04T13:21:31.4093411Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4093456Z Traceback (most recent call last): 2025-12-04T13:21:31.4093618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4093659Z getattr(self, test_name)() 2025-12-04T13:21:31.4093828Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4093864Z fn() 2025-12-04T13:21:31.4094017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4094058Z method(*args, **kwargs) 2025-12-04T13:21:31.4094210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4094250Z method(*args, **kwargs) 2025-12-04T13:21:31.4094400Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4094436Z with policy(): 2025-12-04T13:21:31.4094588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4094629Z raise RuntimeError(msg) 2025-12-04T13:21:31.4094995Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4095007Z 2025-12-04T13:21:31.4095084Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4095327Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4095329Z 2025-12-04T13:21:31.4095419Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4095421Z 2025-12-04T13:21:31.4095424Z 2025-12-04T13:21:31.4095499Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4095587Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4095823Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d0307ff0aa7f20f8.xml - 2025-12-04T13:21:31.4095884Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4096131Z FAILED [22.8268s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4096178Z Traceback (most recent call last): 2025-12-04T13:21:31.4096343Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4096384Z getattr(self, test_name)() 2025-12-04T13:21:31.4096544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4096577Z fn() 2025-12-04T13:21:31.4096730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4096769Z method(*args, **kwargs) 2025-12-04T13:21:31.4096920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4096960Z method(*args, **kwargs) 2025-12-04T13:21:31.4097110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4097146Z with policy(): 2025-12-04T13:21:31.4097299Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4097338Z raise RuntimeError(msg) 2025-12-04T13:21:31.4097702Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T13:21:31.4097705Z 2025-12-04T13:21:31.4097779Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4098009Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4098013Z 2025-12-04T13:21:31.4098101Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4098207Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
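Note: the leak message above compares the caching allocator's allocated bytes and the driver-level allocation before and after the test body. A rough, illustrative sketch of that kind of before/after accounting (not the harness's actual leak-check implementation; test_fn is a stand-in for the test body):

import torch

def check_for_leak(test_fn, device=0):
    # Snapshot caching-allocator and driver-level usage before the test body.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before

    test_fn()

    # Snapshot again afterwards; growth in both numbers is roughly what the
    # failure above reports as "Caching allocator allocated memory" and
    # "CUDA driver allocated memory".
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak: allocator {alloc_before} -> {alloc_after} bytes, "
            f"driver {driver_before} -> {driver_after} bytes"
        )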
2025-12-04T13:21:31.4098270Z ======================= 1 failed, 7 deselected in 22.96s ======================= 2025-12-04T13:21:31.4098306Z Got exit code 1 2025-12-04T13:21:31.4098346Z Retrying single test... 2025-12-04T13:21:31.4098536Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e1069ec4db63983.xml 2025-12-04T13:21:31.4098594Z ============================= test session starts ============================== 2025-12-04T13:21:31.4098723Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4098776Z cachedir: .pytest_cache 2025-12-04T13:21:31.4098934Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4098993Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4099033Z configfile: pytest.ini 2025-12-04T13:21:31.4099197Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4099273Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4099496Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4099540Z Running 1 items in this shard 2025-12-04T13:21:31.4099543Z 2025-12-04T13:21:31.4099853Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 13:09:47.944000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 541657 2025-12-04T13:21:31.4100009Z I1204 13:09:47.945000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 541658 2025-12-04T13:21:31.4100162Z I1204 13:09:47.946000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 541659 2025-12-04T13:21:31.4100312Z I1204 13:09:47.946000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 541660 2025-12-04T13:21:31.4100900Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4100939Z _warn_cpu_init() 2025-12-04T13:21:31.4101238Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4101276Z _init_core_state( 2025-12-04T13:21:31.4101790Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4101854Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4102432Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4102471Z _warn_cpu_init() 2025-12-04T13:21:31.4102765Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4102801Z _init_core_state( 2025-12-04T13:21:31.4103306Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4103389Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4103957Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4103995Z _warn_cpu_init() 2025-12-04T13:21:31.4104288Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4104326Z _init_core_state( 2025-12-04T13:21:31.4104819Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4104878Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4105449Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4105486Z _warn_cpu_init() 2025-12-04T13:21:31.4105974Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4106031Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4106531Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4106590Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4106885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4106922Z _init_core_state( 2025-12-04T13:21:31.4107410Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4107467Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4107965Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4108050Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4109372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
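Note: the UserWarnings above come from passing `device_id` as a bare "cuda" device with no index, so FSDP falls back to whatever the current device happens to be. A minimal sketch of the fix the warning suggests (hypothetical setup code, not from this test): pin each rank to its GPU and pass an indexed device.

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_for_rank(model, rank):
    # Make the "current device" unambiguous for this process...
    torch.cuda.set_device(rank)
    # ...and give FSDP an explicit index rather than the bare "cuda" device.
    return FSDP(model, device_id=torch.device("cuda", rank))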
2025-12-04T13:21:31.4109503Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4110776Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4110901Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4112187Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4112309Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4113588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
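Note: the stream-mismatch warning above names its own opt-out; if the mismatch is known to be intentional, it can be silenced globally with the switch the warning itself points to:

import torch

# Suppress the AccumulateGrad stream-mismatch warning when the mismatch is intentional.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)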
2025-12-04T13:21:31.4113734Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4113964Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114008Z return func(*args, **kwargs) 2025-12-04T13:21:31.4114233Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114275Z return func(*args, **kwargs) 2025-12-04T13:21:31.4114499Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114539Z return func(*args, **kwargs) 2025-12-04T13:21:31.4114763Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114803Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115065Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115322Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115582Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115814Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115856Z return func(*args, **kwargs) 2025-12-04T13:21:31.4116147Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
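Note: the barrier() warning above suggests binding the process group to a device at init time. A sketch under the assumption of a typical per-rank env:// setup (MASTER_ADDR/MASTER_PORT already exported); the `device_id` argument to `init_process_group` is available in recent PyTorch releases:

import torch
import torch.distributed as dist

def init_distributed(rank, world_size):
    torch.cuda.set_device(rank)
    # Binding the group to an indexed device lets collectives like barrier()
    # pick the right GPU instead of guessing from the current context.
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )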
2025-12-04T13:21:31.4116187Z return func(*args, **kwargs) 2025-12-04T13:21:31.4116332Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4116497Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4116791Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4116961Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4117265Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4117390Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4117668Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4117816Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4118093Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4118279Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4118555Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4118691Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4118970Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4119120Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4119601Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4119718Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4119931Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4120290Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4120407Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4120619Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4120784Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4120822Z dist init r=0, world=4 2025-12-04T13:21:31.4120961Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4121120Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4121435Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4121601Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4121887Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4122011Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4122289Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4122438Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4122714Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4122861Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4123138Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4123274Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4123552Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4123703Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4124180Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4124303Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4124500Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4124856Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4124970Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4125182Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4125345Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4125384Z dist init r=3, world=4 2025-12-04T13:21:31.4125545Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4125716Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4126006Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4126165Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4126451Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4126577Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4126854Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4127000Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4127276Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4127422Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4127698Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4127834Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4128111Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4128304Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4128795Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T13:21:31.4128911Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4129104Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4129459Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4129572Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4129799Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4129989Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4130027Z dist init r=2, world=4 2025-12-04T13:21:31.4130166Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4130324Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4130611Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4130765Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4131050Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4131174Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4131449Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4131597Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4131874Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4132023Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4132296Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4132432Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4132716Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4132866Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4133345Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T13:21:31.4133459Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4133654Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4134016Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4134139Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4134360Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4134524Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4134563Z dist init r=1, world=4 2025-12-04T13:21:31.4134900Z [rank0]:[W1204 13:09:56.483041909 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135230Z [rank3]:[W1204 13:09:56.497385556 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135561Z [rank2]:[W1204 13:09:56.499735778 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135888Z [rank1]:[W1204 13:09:56.621393473 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135927Z FAILED [23.0278s] [100%] 2025-12-04T13:21:31.4135931Z 2025-12-04T13:21:31.4135989Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4136091Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T13:21:31.4136138Z Traceback (most recent call last): 2025-12-04T13:21:31.4136303Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4136346Z self._join_processes(fn) 2025-12-04T13:21:31.4136519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4136572Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4136750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4136792Z raise RuntimeError(error) 2025-12-04T13:21:31.4136885Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4136929Z Traceback (most recent call last): 2025-12-04T13:21:31.4137091Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4137133Z getattr(self, test_name)() 2025-12-04T13:21:31.4137292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4137326Z fn() 2025-12-04T13:21:31.4137478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4137517Z method(*args, **kwargs) 2025-12-04T13:21:31.4137669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4137709Z method(*args, **kwargs) 2025-12-04T13:21:31.4137861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4137898Z with policy(): 2025-12-04T13:21:31.4138071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4138125Z raise RuntimeError(msg) 2025-12-04T13:21:31.4138508Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4138511Z 2025-12-04T13:21:31.4138586Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4138815Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4138818Z 2025-12-04T13:21:31.4138907Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4138910Z 2025-12-04T13:21:31.4138971Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4139017Z Traceback (most recent call last): 2025-12-04T13:21:31.4139180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4139221Z getattr(self, test_name)() 2025-12-04T13:21:31.4139378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4139414Z fn() 2025-12-04T13:21:31.4139564Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4139605Z method(*args, **kwargs) 2025-12-04T13:21:31.4139754Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4139795Z method(*args, **kwargs) 2025-12-04T13:21:31.4139944Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4139983Z with policy(): 2025-12-04T13:21:31.4140134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4140175Z raise RuntimeError(msg) 2025-12-04T13:21:31.4140525Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T13:21:31.4140528Z 2025-12-04T13:21:31.4140601Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4140854Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4140858Z 2025-12-04T13:21:31.4140945Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4140948Z 2025-12-04T13:21:31.4141008Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4141052Z Traceback (most recent call last): 2025-12-04T13:21:31.4141216Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4141257Z getattr(self, test_name)() 2025-12-04T13:21:31.4141415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4141448Z fn() 2025-12-04T13:21:31.4141600Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4141638Z method(*args, **kwargs) 2025-12-04T13:21:31.4141802Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4141854Z method(*args, **kwargs) 2025-12-04T13:21:31.4142017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4142054Z with policy(): 2025-12-04T13:21:31.4142205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4142245Z raise RuntimeError(msg) 2025-12-04T13:21:31.4142595Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4142598Z 2025-12-04T13:21:31.4142671Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4142898Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4142902Z 2025-12-04T13:21:31.4142989Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4142991Z 2025-12-04T13:21:31.4142993Z 2025-12-04T13:21:31.4143069Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4143156Z Process 0 terminated with exit code 10, terminating remaining processes. 
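Note: the ProcessGroupNCCL warnings earlier in this run point at workers exiting without tearing down the process group. A minimal sketch of the shutdown the warning recommends (run_body is a hypothetical stand-in for the per-rank test body):

import torch.distributed as dist

def worker(rank, world_size):
    try:
        run_body(rank, world_size)  # hypothetical per-rank work
    finally:
        # Explicit teardown avoids the destroy_process_group() warning at exit.
        if dist.is_initialized():
            dist.destroy_process_group()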
2025-12-04T13:21:31.4143389Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e1069ec4db63983.xml - 2025-12-04T13:21:31.4143448Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4143695Z FAILED [23.0278s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4143741Z Traceback (most recent call last): 2025-12-04T13:21:31.4143907Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4143948Z getattr(self, test_name)() 2025-12-04T13:21:31.4144106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4144139Z fn() 2025-12-04T13:21:31.4144289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4144327Z method(*args, **kwargs) 2025-12-04T13:21:31.4144488Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4144526Z method(*args, **kwargs) 2025-12-04T13:21:31.4144679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4144717Z with policy(): 2025-12-04T13:21:31.4144868Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4144908Z raise RuntimeError(msg) 2025-12-04T13:21:31.4145257Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4145259Z 2025-12-04T13:21:31.4145332Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4145558Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4145560Z 2025-12-04T13:21:31.4145667Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4145681Z 2025-12-04T13:21:31.4145739Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4145784Z Traceback (most recent call last): 2025-12-04T13:21:31.4145945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4145988Z getattr(self, test_name)() 2025-12-04T13:21:31.4146146Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4146180Z fn() 2025-12-04T13:21:31.4146329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4146369Z method(*args, **kwargs) 2025-12-04T13:21:31.4146519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4146560Z method(*args, **kwargs) 2025-12-04T13:21:31.4146709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4146746Z with policy(): 2025-12-04T13:21:31.4146897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4146937Z raise RuntimeError(msg) 2025-12-04T13:21:31.4147286Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T13:21:31.4147288Z 2025-12-04T13:21:31.4147361Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4147588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4147592Z 2025-12-04T13:21:31.4147679Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4147681Z 2025-12-04T13:21:31.4147739Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4147784Z Traceback (most recent call last): 2025-12-04T13:21:31.4147946Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4147987Z getattr(self, test_name)() 2025-12-04T13:21:31.4148179Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4148227Z fn() 2025-12-04T13:21:31.4148378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4148419Z method(*args, **kwargs) 2025-12-04T13:21:31.4148569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4148609Z method(*args, **kwargs) 2025-12-04T13:21:31.4148758Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4148795Z with policy(): 2025-12-04T13:21:31.4148945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4148986Z raise RuntimeError(msg) 2025-12-04T13:21:31.4149337Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4149339Z 2025-12-04T13:21:31.4149441Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4149679Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4149681Z 2025-12-04T13:21:31.4149767Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4149830Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4149895Z ====================== 1 failed, 18 deselected in 23.16s ======================= 2025-12-04T13:21:31.4149933Z Got exit code 1 2025-12-04T13:21:31.4149973Z Retrying single test... 
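Note on the failure above: the run sets PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, so the harness snapshots caching-allocator and driver-reported memory around each test and raises the RuntimeError shown when both grow; the printed repro command reruns only this test from the repo root. The sketch below merely illustrates that before/after comparison using public torch.cuda APIs; it is not the harness code in torch/testing/_internal/common_utils.py, and check_leak / run_suspect_test are placeholder names.

import torch

def check_leak(run_suspect_test, device=0):
    # Illustrative only: compare the two quantities quoted in the error message,
    # caching-allocator bytes and driver-allocated bytes, before and after the test.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free

    run_suspect_test()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()  # drop cached blocks before re-measuring
    alloc_after = torch.cuda.memory_allocated(device)
    free, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after}"
        )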
2025-12-04T13:21:31.4150162Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0036e35144f9d74b.xml 2025-12-04T13:21:31.4150221Z ============================= test session starts ============================== 2025-12-04T13:21:31.4150336Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4150377Z cachedir: .pytest_cache 2025-12-04T13:21:31.4150538Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4150583Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4150625Z configfile: pytest.ini 2025-12-04T13:21:31.4150788Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4150863Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4151084Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4151128Z Running 1 items in this shard 2025-12-04T13:21:31.4151130Z 2025-12-04T13:21:31.4151435Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 13:10:13.383000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 542923 2025-12-04T13:21:31.4151592Z I1204 13:10:13.384000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 542924 2025-12-04T13:21:31.4151745Z I1204 13:10:13.385000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 542925 2025-12-04T13:21:31.4151895Z I1204 13:10:13.386000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 542926 2025-12-04T13:21:31.4152493Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4152532Z _warn_cpu_init() 2025-12-04T13:21:31.4152830Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4152866Z _init_core_state( 2025-12-04T13:21:31.4153363Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4153435Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4154023Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4154079Z _warn_cpu_init() 2025-12-04T13:21:31.4154372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4154410Z _init_core_state( 2025-12-04T13:21:31.4154900Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4154962Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4155533Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4155569Z _warn_cpu_init() 2025-12-04T13:21:31.4155862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4155900Z _init_core_state( 2025-12-04T13:21:31.4156393Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4156452Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4157033Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4157072Z _warn_cpu_init() 2025-12-04T13:21:31.4157559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4157617Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4158110Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4158218Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4158512Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4158548Z _init_core_state( 2025-12-04T13:21:31.4159040Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4159098Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4159583Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4159641Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4160922Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4161048Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4162335Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4162459Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4163737Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4163885Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4165154Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
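The AccumulateGrad stream-mismatch UserWarnings repeated above are informational; the warning text itself names the switch for silencing them when the mismatch is intentional, shown here as a one-line sketch.

import torch

# Per the warning text above: suppress the AccumulateGrad stream-mismatch warning
# when the mismatch is known to be intentional.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)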
2025-12-04T13:21:31.4165276Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4165505Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4165548Z return func(*args, **kwargs) 2025-12-04T13:21:31.4165770Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4165812Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166034Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4166084Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166307Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4166350Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166570Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4166611Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166830Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4166870Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167090Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4167130Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167361Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4167420Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
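The FSDP `device_id` warnings and the barrier() warning above share one cause: the process never pins an explicit CUDA device, so FSDP receives a bare "cuda" device and barrier() falls back to the current context. Below is a hedged sketch of the fix the warnings themselves recommend; setup_fsdp, model, and rank are placeholders, and it assumes the usual env:// rendezvous variables (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE) are already set.

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def setup_fsdp(model, rank):
    # Pin the device before any FSDP or collective call, as the warnings advise.
    torch.cuda.set_device(rank)
    device = torch.device("cuda", rank)
    # Passing device_id here also addresses the barrier() warning about
    # "using the device under current context".
    dist.init_process_group("nccl", device_id=device)
    # An indexed device (cuda:<rank>) instead of bare "cuda" avoids the
    # "does not have an explicit index" warning.
    return FSDP(model, device_id=device)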
2025-12-04T13:21:31.4167751Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167896Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4168059Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4168383Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4168540Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4168829Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4168954Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4169233Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4169384Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4169661Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4169810Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4170085Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4170223Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4170514Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4170667Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4171149Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4171264Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4171461Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4171829Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4171975Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4172187Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4172353Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4172392Z dist init r=0, world=4 2025-12-04T13:21:31.4172532Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4172692Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4172980Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4173135Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4173421Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4173547Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4173822Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4173972Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4174247Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4174393Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4174680Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4174818Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4175096Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4175244Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4175723Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T13:21:31.4175848Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4176061Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4176417Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4176530Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4176743Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4176907Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4176947Z dist init r=2, world=4 2025-12-04T13:21:31.4177087Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4177246Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4177532Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4177685Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4177974Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4178099Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4178412Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4178559Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4178849Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4178997Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4179273Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4179410Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4179686Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4179835Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4180324Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 2025-12-04T13:21:31.4180463Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4180659Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4181013Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4181127Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4181340Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4181505Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4181543Z dist init r=1, world=4 2025-12-04T13:21:31.4181681Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4181839Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4182126Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4182283Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4182570Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4182695Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4182970Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4183129Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4183405Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4183554Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4183829Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4183965Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4184244Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4184411Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4184898Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 
2025-12-04T13:21:31.4185014Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4185210Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4185565Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4185679Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4185889Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4186052Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4186090Z dist init r=3, world=4 2025-12-04T13:21:31.4186427Z [rank0]:[W1204 13:10:22.920921713 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4186758Z [rank2]:[W1204 13:10:22.968009349 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4187087Z [rank1]:[W1204 13:10:22.053992346 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4187425Z [rank3]:[W1204 13:10:22.149778383 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
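On the ProcessGroupNCCL warnings just above: every rank exits without tearing down its process group, which the warning flags as a potential resource leak. A minimal teardown sketch follows, with main as a placeholder rather than the harness code.

import torch.distributed as dist

def main():
    dist.init_process_group("nccl")  # assumes env:// rendezvous variables are set
    try:
        ...  # test or training body
    finally:
        dist.destroy_process_group()  # explicit teardown avoids the warning above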
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4187467Z FAILED [22.8258s] [100%] 2025-12-04T13:21:31.4187469Z 2025-12-04T13:21:31.4187528Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4187631Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T13:21:31.4187676Z Traceback (most recent call last): 2025-12-04T13:21:31.4187842Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4187885Z self._join_processes(fn) 2025-12-04T13:21:31.4188059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4188113Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4188324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4188368Z raise RuntimeError(error) 2025-12-04T13:21:31.4188466Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4188524Z Traceback (most recent call last): 2025-12-04T13:21:31.4188699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4188741Z getattr(self, test_name)() 2025-12-04T13:21:31.4188899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4188935Z fn() 2025-12-04T13:21:31.4189086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4189127Z method(*args, **kwargs) 2025-12-04T13:21:31.4189278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4189320Z method(*args, **kwargs) 2025-12-04T13:21:31.4189470Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4189509Z with policy(): 2025-12-04T13:21:31.4189663Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4189704Z raise RuntimeError(msg) 2025-12-04T13:21:31.4190055Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4190058Z 2025-12-04T13:21:31.4190134Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4190363Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4190366Z 2025-12-04T13:21:31.4190455Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4190458Z 2025-12-04T13:21:31.4190460Z 2025-12-04T13:21:31.4190535Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4190622Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4190855Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0036e35144f9d74b.xml - 2025-12-04T13:21:31.4190916Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4191175Z FAILED [22.8258s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4191221Z Traceback (most recent call last): 2025-12-04T13:21:31.4191387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4191430Z getattr(self, test_name)() 2025-12-04T13:21:31.4191590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4191624Z fn() 2025-12-04T13:21:31.4191776Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4191816Z method(*args, **kwargs) 2025-12-04T13:21:31.4191970Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4192010Z method(*args, **kwargs) 2025-12-04T13:21:31.4192160Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4192198Z with policy(): 2025-12-04T13:21:31.4192358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4192422Z raise RuntimeError(msg) 2025-12-04T13:21:31.4192772Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T13:21:31.4192774Z 2025-12-04T13:21:31.4192849Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4193077Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4193079Z 2025-12-04T13:21:31.4193168Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4193232Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.4193295Z ====================== 1 failed, 18 deselected in 22.96s ======================= 2025-12-04T13:21:31.4193333Z Got exit code 1 2025-12-04T13:21:31.4193511Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4193639Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4193828Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-81529582b53aae4e.xml 2025-12-04T13:21:31.4193886Z ============================= test session starts ============================== 2025-12-04T13:21:31.4193999Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4194041Z cachedir: .pytest_cache 2025-12-04T13:21:31.4194202Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4194249Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4194290Z configfile: pytest.ini 2025-12-04T13:21:31.4194452Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4194526Z collecting ... collected 60 items / 8 deselected / 52 selected 2025-12-04T13:21:31.4194579Z stepcurrent: skipping 8 already run items. 2025-12-04T13:21:31.4194621Z Running 11 items in this shard 2025-12-04T13:21:31.4194623Z 2025-12-04T13:21:31.4194942Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:10:38.900000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 544189 2025-12-04T13:21:31.4195097Z I1204 13:10:38.901000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 544190 2025-12-04T13:21:31.4195252Z I1204 13:10:38.902000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 544191 2025-12-04T13:21:31.4195402Z I1204 13:10:38.902000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 544192 2025-12-04T13:21:31.4195984Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4196022Z _warn_cpu_init() 2025-12-04T13:21:31.4196530Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4196612Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4197181Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4197220Z _warn_cpu_init() 2025-12-04T13:21:31.4197786Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4197824Z _warn_cpu_init() 2025-12-04T13:21:31.4198355Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4198416Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4198907Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4198968Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4199554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4199593Z _warn_cpu_init() 2025-12-04T13:21:31.4199885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4199972Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4200465Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4200522Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4200826Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4200921Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4201431Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4201487Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4201780Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4201860Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4202149Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4202227Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4202512Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4202592Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4203085Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4203145Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4203632Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4203688Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4203994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4204069Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4204356Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4204437Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4204724Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4204798Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4205089Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4205142Z return func(*args, **kwargs) 2025-12-04T13:21:31.4205379Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4205433Z return func(*args, **kwargs) 2025-12-04T13:21:31.4205655Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4205697Z return func(*args, **kwargs) 2025-12-04T13:21:31.4205917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4205959Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206184Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4206227Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206448Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4206488Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4206747Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206967Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4207009Z return func(*args, **kwargs) 2025-12-04T13:21:31.4207229Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
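The FutureWarnings above are raised because the test wraps modules with ShardingStrategy.NO_SHARD, which this build deprecates in favor of DistributedDataParallel. Below is a hedged sketch of the suggested replacement; wrap_unsharded, model, and rank are placeholders, not the helpers in common_fsdp.py.

import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_unsharded(model, rank):
    # Replicate the model with DDP instead of FSDP(..., sharding_strategy=NO_SHARD),
    # as the deprecation warning suggests.
    model = model.to(torch.device("cuda", rank))
    return DDP(model, device_ids=[rank])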
2025-12-04T13:21:31.4207270Z return func(*args, **kwargs) 2025-12-04T13:21:31.4207416Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4207580Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4207873Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4208040Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4208390Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4208519Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4208798Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4208948Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4209225Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4209388Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4209675Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4209828Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4210105Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4210256Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4210742Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:21:31.4210861Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4211059Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4211422Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4211540Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4211755Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4211920Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4211959Z dist init r=1, world=4 2025-12-04T13:21:31.4212098Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4212257Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4212557Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4212713Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4213000Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4213125Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4213403Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4213552Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4213838Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4214004Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4214280Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4214416Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4214697Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4214846Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4215329Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4215446Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4215644Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4216005Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4216121Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4216334Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4216499Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4216537Z dist init r=3, world=4 2025-12-04T13:21:31.4216687Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4216848Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4217136Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4217291Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4217577Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4217702Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4217992Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4218219Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4218495Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4218644Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4218920Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4219058Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4219337Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4219487Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4219967Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:21:31.4220082Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4220280Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4220640Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4220755Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4220979Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4221145Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4221184Z dist init r=0, world=4 2025-12-04T13:21:31.4221322Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4221483Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4221769Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4221924Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4222221Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4222360Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4222653Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4222802Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4223078Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4223225Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4223501Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4223638Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4223915Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4224063Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4224543Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4224660Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4224857Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4225218Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4225342Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4225555Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4225719Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4225758Z dist init r=2, world=4 2025-12-04T13:21:31.4226096Z [rank1]:[W1204 13:10:47.508836314 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4226427Z [rank3]:[W1204 13:10:47.528499745 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4226768Z [rank0]:[W1204 13:10:47.552294980 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4227115Z [rank2]:[W1204 13:10:47.576013066 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4227156Z FAILED [23.0242s] [ 9%] 2025-12-04T13:21:31.4227158Z 2025-12-04T13:21:31.4227215Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4227318Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:21:31.4227364Z Traceback (most recent call last): 2025-12-04T13:21:31.4227530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4227576Z self._join_processes(fn) 2025-12-04T13:21:31.4227750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4227804Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4227982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4228026Z raise RuntimeError(error) 2025-12-04T13:21:31.4228106Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4228206Z Traceback (most recent call last): 2025-12-04T13:21:31.4228368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4228411Z getattr(self, test_name)() 2025-12-04T13:21:31.4228569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4228605Z fn() 2025-12-04T13:21:31.4228757Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4228799Z method(*args, **kwargs) 2025-12-04T13:21:31.4228950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4228990Z method(*args, **kwargs) 2025-12-04T13:21:31.4229139Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4229177Z with policy(): 2025-12-04T13:21:31.4229341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4229383Z raise RuntimeError(msg) 2025-12-04T13:21:31.4229741Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:21:31.4229746Z 2025-12-04T13:21:31.4229821Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4230056Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4230059Z 2025-12-04T13:21:31.4230146Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4230148Z 2025-12-04T13:21:31.4230150Z 2025-12-04T13:21:31.4230226Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4230314Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4230573Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-81529582b53aae4e.xml - 2025-12-04T13:21:31.4230646Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4230895Z FAILED [23.0242s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4230941Z Traceback (most recent call last): 2025-12-04T13:21:31.4231106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4231148Z getattr(self, test_name)() 2025-12-04T13:21:31.4231309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4231344Z fn() 2025-12-04T13:21:31.4231497Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4231539Z method(*args, **kwargs) 2025-12-04T13:21:31.4231690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4231730Z method(*args, **kwargs) 2025-12-04T13:21:31.4231882Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4231919Z with policy(): 2025-12-04T13:21:31.4232071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4232112Z raise RuntimeError(msg) 2025-12-04T13:21:31.4232468Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:21:31.4232472Z 2025-12-04T13:21:31.4232548Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4232781Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4232784Z 2025-12-04T13:21:31.4232872Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4232934Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
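For reference, the repeated UserWarning captured above recommends pinning each rank to its GPU with `torch.cuda.set_device()` or passing an explicit device index as `device_id`, and the FutureWarning points from the deprecated `NO_SHARD` strategy toward `DistributedDataParallel`. The following is only a minimal sketch of that recommendation, not part of the failing test: the file name, the toy `torch.nn.Linear(16, 16)` model, and the assumption of a `torchrun --nproc-per-node=4` launch (which sets `LOCAL_RANK`) are illustrative.

import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main() -> None:
    # torchrun sets LOCAL_RANK; pin each process to its own GPU up front,
    # which avoids the "does not have an explicit index" warning above.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Passing device_id here also silences the barrier() warning about
    # "using the device under current context".
    dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))

    model = torch.nn.Linear(16, 16)
    # Explicit device index instead of the bare "cuda" string flagged above.
    fsdp_model = FSDP(model, device_id=torch.device("cuda", local_rank))
    fsdp_model(torch.randn(8, 16, device="cuda")).sum().backward()

    # Explicit teardown avoids the ProcessGroupNCCL shutdown warning below.
    dist.destroy_process_group()


if __name__ == "__main__":
    main()

Launched with something like `torchrun --nproc-per-node=4 fsdp_device_example.py` (a hypothetical file name), this runs one process per GPU with an unambiguous device on every rank.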
2025-12-04T13:21:31.4232997Z ======================= 1 failed, 8 deselected in 23.16s ======================= 2025-12-04T13:21:31.4233035Z Got exit code 1 2025-12-04T13:21:31.4233084Z Retrying single test... 2025-12-04T13:21:31.4233276Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b083c281fab1e433.xml 2025-12-04T13:21:31.4233335Z ============================= test session starts ============================== 2025-12-04T13:21:31.4233449Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4233490Z cachedir: .pytest_cache 2025-12-04T13:21:31.4233650Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4233695Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4233735Z configfile: pytest.ini 2025-12-04T13:21:31.4233898Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4233974Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4234213Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4234275Z Running 1 items in this shard 2025-12-04T13:21:31.4234287Z 2025-12-04T13:21:31.4234596Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:11:04.310000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 545599 2025-12-04T13:21:31.4234753Z I1204 13:11:04.311000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 545600 2025-12-04T13:21:31.4234905Z I1204 13:11:04.312000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 545601 2025-12-04T13:21:31.4235057Z I1204 13:11:04.313000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 545602 2025-12-04T13:21:31.4235640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4235679Z _warn_cpu_init() 2025-12-04T13:21:31.4236176Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4236238Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4236816Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4236855Z _warn_cpu_init() 2025-12-04T13:21:31.4237360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4237421Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4237995Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4238034Z _warn_cpu_init() 2025-12-04T13:21:31.4238562Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4238620Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4239203Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4239263Z _warn_cpu_init() 2025-12-04T13:21:31.4239561Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4239647Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4240143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4240202Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4240488Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4240571Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4240858Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4240940Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4241432Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4241490Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4241779Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4241871Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4242159Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4242236Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4242524Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4242597Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4243089Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4243168Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4243468Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4243512Z return func(*args, **kwargs) 2025-12-04T13:21:31.4243796Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4243876Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4244371Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4244431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4244718Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4244791Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4245019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245062Z return func(*args, **kwargs) 2025-12-04T13:21:31.4245287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245329Z return func(*args, **kwargs) 2025-12-04T13:21:31.4245552Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245592Z return func(*args, **kwargs) 2025-12-04T13:21:31.4245813Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245852Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246073Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4246121Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246342Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4246384Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246603Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4246643Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246861Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T13:21:31.4246903Z return func(*args, **kwargs) 2025-12-04T13:21:31.4247047Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4247211Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4247522Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4247688Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4247973Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4248099Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4248418Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4248568Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4248846Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4248993Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4249271Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4249408Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4249688Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4249837Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4250319Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4250451Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4250648Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4251011Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4251126Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4251340Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4251507Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4251545Z dist init r=0, world=4 2025-12-04T13:21:31.4251715Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4251886Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4252173Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4252327Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4252613Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4252737Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4253014Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4253161Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4253436Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4253585Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4253862Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4254001Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4254277Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4254425Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4254917Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4255034Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4255230Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4255588Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4255704Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4255927Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4256112Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4256151Z dist init r=3, world=4 2025-12-04T13:21:31.4256288Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4256446Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4256733Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4256888Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4257172Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4257297Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4257573Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4257720Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4257998Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4258194Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4258471Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4258606Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4258896Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4259045Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4259523Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:21:31.4259638Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4259833Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4260205Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4260343Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4260554Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4260719Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4260757Z dist init r=1, world=4 2025-12-04T13:21:31.4260896Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4261055Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4261344Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4261499Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4261785Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4261909Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4262186Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4262337Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4262613Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4262761Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4263037Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4263185Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4263462Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4263613Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4264092Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4264206Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4264412Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4264789Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4264905Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4265114Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4265281Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4265319Z dist init r=2, world=4 2025-12-04T13:21:31.4265657Z [rank3]:[W1204 13:11:13.152041512 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4265988Z [rank0]:[W1204 13:11:13.160596354 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4266314Z [rank1]:[W1204 13:11:13.177252135 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4266643Z [rank2]:[W1204 13:11:13.315506001 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4266684Z FAILED [23.2271s] [100%] 2025-12-04T13:21:31.4266687Z 2025-12-04T13:21:31.4266746Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4266849Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:21:31.4266895Z Traceback (most recent call last): 2025-12-04T13:21:31.4267059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4267103Z self._join_processes(fn) 2025-12-04T13:21:31.4267293Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4267348Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4267531Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4267575Z raise RuntimeError(error) 2025-12-04T13:21:31.4267657Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4267701Z Traceback (most recent call last): 2025-12-04T13:21:31.4267864Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4267905Z getattr(self, test_name)() 2025-12-04T13:21:31.4268063Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4268097Z fn() 2025-12-04T13:21:31.4268284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4268325Z method(*args, **kwargs) 2025-12-04T13:21:31.4268492Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4268544Z method(*args, **kwargs) 2025-12-04T13:21:31.4268707Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4268745Z with policy(): 2025-12-04T13:21:31.4268897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4268938Z raise RuntimeError(msg) 2025-12-04T13:21:31.4269293Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4269297Z 2025-12-04T13:21:31.4269374Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4269609Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4269613Z 2025-12-04T13:21:31.4269702Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4269704Z 2025-12-04T13:21:31.4269705Z 2025-12-04T13:21:31.4269779Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4269869Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4270105Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b083c281fab1e433.xml - 2025-12-04T13:21:31.4270167Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4270417Z FAILED [23.2271s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4270465Z Traceback (most recent call last): 2025-12-04T13:21:31.4270631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4270673Z getattr(self, test_name)() 2025-12-04T13:21:31.4270833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4270868Z fn() 2025-12-04T13:21:31.4271021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4271061Z method(*args, **kwargs) 2025-12-04T13:21:31.4271226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4271266Z method(*args, **kwargs) 2025-12-04T13:21:31.4271417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4271455Z with policy(): 2025-12-04T13:21:31.4271607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4271646Z raise RuntimeError(msg) 2025-12-04T13:21:31.4272000Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:21:31.4272002Z 2025-12-04T13:21:31.4272076Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4272313Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4272342Z 2025-12-04T13:21:31.4272431Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4272506Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
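The RuntimeError above is raised by the leak-check policy that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 (visible in the repro command) wraps around the test body: GPU memory is sampled before and after the test, and the run fails when the numbers have grown, with the message noting that the driver API "confirmed" the allocator-level discrepancy. A minimal, illustrative sketch of that kind of before/after comparison in plain PyTorch, assuming a hypothetical helper `run_with_leak_check`; this is not the actual torch.testing._internal policy implementation:

import torch

def run_with_leak_check(fn, device=0):
    # Snapshot memory before the test body: caching-allocator bytes plus a driver-level view.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before

    fn()  # run the test body

    # Snapshot again after the test body and compare.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver {driver_before} -> {driver_after} bytes"
        )

The figures in the failure (512 -> 166400 allocator bytes, roughly 2.4 GB -> 17.6 GB driver-allocated) are what such a comparison surfaces; the retry that follows reruns only this test to establish whether the same comparison fails again.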
2025-12-04T13:21:31.4272571Z ====================== 1 failed, 18 deselected in 23.37s ======================= 2025-12-04T13:21:31.4272608Z Got exit code 1 2025-12-04T13:21:31.4272649Z Retrying single test... 2025-12-04T13:21:31.4272838Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ad311a424db7abe.xml 2025-12-04T13:21:31.4272896Z ============================= test session starts ============================== 2025-12-04T13:21:31.4273008Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4273050Z cachedir: .pytest_cache 2025-12-04T13:21:31.4273208Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4273256Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4273298Z configfile: pytest.ini 2025-12-04T13:21:31.4273462Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4273537Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4273762Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4273806Z Running 1 items in this shard 2025-12-04T13:21:31.4273808Z 2025-12-04T13:21:31.4274117Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:11:30.385000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 547009 2025-12-04T13:21:31.4274272Z I1204 13:11:30.386000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 547010 2025-12-04T13:21:31.4274427Z I1204 13:11:30.386000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 547011 2025-12-04T13:21:31.4274579Z I1204 13:11:30.387000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 547012 2025-12-04T13:21:31.4275174Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4275212Z _warn_cpu_init() 2025-12-04T13:21:31.4275708Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4275772Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4276345Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4276382Z _warn_cpu_init() 2025-12-04T13:21:31.4276895Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4276965Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4277540Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4277580Z _warn_cpu_init() 2025-12-04T13:21:31.4278068Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4278129Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4278752Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4278791Z _warn_cpu_init() 2025-12-04T13:21:31.4279083Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4279166Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4279455Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4279536Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4280044Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4280104Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4280392Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4280472Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4281056Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4281128Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4281437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4281516Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4281800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4281876Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4282164Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4282239Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4282731Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4282789Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4283081Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4283124Z return func(*args, **kwargs) 2025-12-04T13:21:31.4283412Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4283493Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4283982Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4284041Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4284341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4284418Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4284646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4284689Z return func(*args, **kwargs) 2025-12-04T13:21:31.4284912Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4284954Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285176Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4285218Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285448Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4285507Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285728Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4285767Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4286026Z return func(*args, **kwargs) 2025-12-04T13:21:31.4286246Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4286286Z return func(*args, **kwargs) 2025-12-04T13:21:31.4286508Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.4286548Z return func(*args, **kwargs) 2025-12-04T13:21:31.4286696Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4286859Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4287150Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4287305Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4287593Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4287721Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4287999Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4288193Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4288489Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4288639Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4288914Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4289053Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4289332Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4289482Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4289978Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4290118Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4290315Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4290676Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4290792Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4291006Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4291170Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4291219Z dist init r=2, world=4 2025-12-04T13:21:31.4291397Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4291560Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4291848Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4292004Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4292289Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4292415Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4292703Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4292852Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4293130Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4293276Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4293552Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4293690Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4293980Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4294152Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4294630Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:21:31.4294745Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4294942Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4295303Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4295419Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4295629Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4295795Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4295834Z dist init r=0, world=4 2025-12-04T13:21:31.4295976Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4296137Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4296426Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4296579Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4296877Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4297001Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4297279Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4297429Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4297704Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4297853Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4298138Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4298329Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4298623Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4298773Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4299252Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4299369Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4299565Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4299923Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4300037Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4300249Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4300413Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4300454Z dist init r=3, world=4 2025-12-04T13:21:31.4300591Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4300752Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4301039Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4301204Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4301489Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4301616Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4301892Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4302040Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4302317Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4302477Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4302775Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4302910Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4303189Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4303337Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4303814Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:21:31.4303931Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4304125Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4304486Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4304599Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4304812Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4304977Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4305014Z dist init r=1, world=4 2025-12-04T13:21:31.4305355Z [rank2]:[W1204 13:11:39.969171858 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4305699Z [rank0]:[W1204 13:11:39.982246177 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4306027Z [rank3]:[W1204 13:11:39.990209488 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4306352Z [rank1]:[W1204 13:11:39.166457355 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4306393Z FAILED [23.0269s] [100%] 2025-12-04T13:21:31.4306395Z 2025-12-04T13:21:31.4306453Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4306554Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:21:31.4306611Z Traceback (most recent call last): 2025-12-04T13:21:31.4306785Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4306839Z self._join_processes(fn) 2025-12-04T13:21:31.4307013Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4307067Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4307245Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4307289Z raise RuntimeError(error) 2025-12-04T13:21:31.4307369Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4307415Z Traceback (most recent call last): 2025-12-04T13:21:31.4307578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4307623Z getattr(self, test_name)() 2025-12-04T13:21:31.4307782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4307819Z fn() 2025-12-04T13:21:31.4307969Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4308011Z method(*args, **kwargs) 2025-12-04T13:21:31.4308191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4308233Z method(*args, **kwargs) 2025-12-04T13:21:31.4308383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4308420Z with policy(): 2025-12-04T13:21:31.4308573Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4308615Z raise RuntimeError(msg) 2025-12-04T13:21:31.4308970Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4308973Z 2025-12-04T13:21:31.4309048Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4309281Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4309283Z 2025-12-04T13:21:31.4309387Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4309389Z 2025-12-04T13:21:31.4309450Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4309497Z Traceback (most recent call last): 2025-12-04T13:21:31.4309662Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4309704Z getattr(self, test_name)() 2025-12-04T13:21:31.4309863Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4309897Z fn() 2025-12-04T13:21:31.4310049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4310088Z method(*args, **kwargs) 2025-12-04T13:21:31.4310238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4310277Z method(*args, **kwargs) 2025-12-04T13:21:31.4310428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4310493Z with policy(): 2025-12-04T13:21:31.4310644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4310699Z raise RuntimeError(msg) 2025-12-04T13:21:31.4311050Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4311052Z 2025-12-04T13:21:31.4311126Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4311358Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4311360Z 2025-12-04T13:21:31.4311448Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4311452Z 2025-12-04T13:21:31.4311509Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4311557Z Traceback (most recent call last): 2025-12-04T13:21:31.4311719Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4311762Z getattr(self, test_name)() 2025-12-04T13:21:31.4311920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4311954Z fn() 2025-12-04T13:21:31.4312106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4312144Z method(*args, **kwargs) 2025-12-04T13:21:31.4312296Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4312335Z method(*args, **kwargs) 2025-12-04T13:21:31.4312486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4312524Z with policy(): 2025-12-04T13:21:31.4312676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4312716Z raise RuntimeError(msg) 2025-12-04T13:21:31.4313066Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4313068Z 2025-12-04T13:21:31.4313141Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4313382Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4313385Z 2025-12-04T13:21:31.4313472Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4313475Z 2025-12-04T13:21:31.4313477Z 2025-12-04T13:21:31.4313553Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4313641Z Process 0 terminated with exit code 10, terminating remaining processes. 
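Two of the warnings repeated throughout this run point at cleanup and device selection rather than at the leak itself: ProcessGroupNCCL warns that destroy_process_group() was not called before exit, and the c10d barrier() warns that no device was tied to the process group. A minimal sketch of the shutdown pattern those warnings ask for, assuming a torchrun-style launcher that sets LOCAL_RANK and the rendezvous environment variables (the function name `main` and the elided body are placeholders):

import os

import torch
import torch.distributed as dist

def main():
    local_rank = int(os.environ["LOCAL_RANK"])
    device = torch.device("cuda", local_rank)
    torch.cuda.set_device(device)
    # Binding the group to a device also silences the
    # "barrier(): using the device under current context" warning seen above.
    dist.init_process_group(backend="nccl", device_id=device)
    try:
        ...  # test or training body
        dist.barrier()
    finally:
        # Explicit teardown; skipping this is what triggers the
        # ProcessGroupNCCL.cpp:1553 warning in the log.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()

Here the worker processes are spawned by common_distributed.py rather than by user code, so the warning comes from the harness's own workers; the pattern above is the general one the warning's linked documentation points to.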
2025-12-04T13:21:31.4313876Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ad311a424db7abe.xml - 2025-12-04T13:21:31.4313937Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4314187Z FAILED [23.0269s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4314249Z Traceback (most recent call last): 2025-12-04T13:21:31.4314422Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4314475Z getattr(self, test_name)() 2025-12-04T13:21:31.4314636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4314672Z fn() 2025-12-04T13:21:31.4314824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4314865Z method(*args, **kwargs) 2025-12-04T13:21:31.4315014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4315055Z method(*args, **kwargs) 2025-12-04T13:21:31.4315204Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4315244Z with policy(): 2025-12-04T13:21:31.4315395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4315437Z raise RuntimeError(msg) 2025-12-04T13:21:31.4315790Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4315793Z 2025-12-04T13:21:31.4315865Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4316097Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4316099Z 2025-12-04T13:21:31.4316185Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4316188Z 2025-12-04T13:21:31.4316246Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4316292Z Traceback (most recent call last): 2025-12-04T13:21:31.4318535Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4318581Z getattr(self, test_name)() 2025-12-04T13:21:31.4318752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4318786Z fn() 2025-12-04T13:21:31.4318939Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4318979Z method(*args, **kwargs) 2025-12-04T13:21:31.4319159Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4319199Z method(*args, **kwargs) 2025-12-04T13:21:31.4319352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4319389Z with policy(): 2025-12-04T13:21:31.4319542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4319583Z raise RuntimeError(msg) 2025-12-04T13:21:31.4319937Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4319939Z 2025-12-04T13:21:31.4320013Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4320257Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4320273Z 2025-12-04T13:21:31.4320376Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4320378Z 2025-12-04T13:21:31.4320436Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4320480Z Traceback (most recent call last): 2025-12-04T13:21:31.4320643Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4320685Z getattr(self, test_name)() 2025-12-04T13:21:31.4320844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4320878Z fn() 2025-12-04T13:21:31.4321028Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4321068Z method(*args, **kwargs) 2025-12-04T13:21:31.4321220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4321260Z method(*args, **kwargs) 2025-12-04T13:21:31.4321410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4321446Z with policy(): 2025-12-04T13:21:31.4321597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4321637Z raise RuntimeError(msg) 2025-12-04T13:21:31.4321988Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4321991Z 2025-12-04T13:21:31.4322062Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4322296Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4322299Z 2025-12-04T13:21:31.4322384Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4322449Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
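Besides the leak itself, the most frequent UserWarning in both runs is FSDP reporting that `device_id` was passed as a bare "cuda" with no index, alongside the CPU-init warning recommending `device_id` so that sharding initialization happens on the GPU. A short sketch of the explicit-device form those warnings suggest; `wrap_model` is a hypothetical helper, and the model and process-group setup are assumed to exist elsewhere:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(model: torch.nn.Module, local_rank: int) -> FSDP:
    # Make the current device explicit, as the warning suggests ...
    torch.cuda.set_device(local_rank)
    # ... and hand FSDP an indexed device rather than the bare "cuda" string.
    device = torch.device("cuda", local_rank)
    # An explicit device_id also moves sharding initialization onto the GPU,
    # addressing the "passed-in `module` is on CPU" warning above.
    return FSDP(model, device_id=device)

The FutureWarnings about the `NO_SHARD` sharding strategy, by contrast, come from the test fixtures in common_fsdp.py and already carry their own recommendation (DistributedDataParallel for that configuration).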
2025-12-04T13:21:31.4322513Z ====================== 1 failed, 18 deselected in 23.16s ======================= 2025-12-04T13:21:31.4322551Z Got exit code 1 2025-12-04T13:21:31.4322732Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4322873Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4323066Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-edddab3c46c1b17a.xml 2025-12-04T13:21:31.4323125Z ============================= test session starts ============================== 2025-12-04T13:21:31.4323239Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4323280Z cachedir: .pytest_cache 2025-12-04T13:21:31.4323439Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4323485Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4323526Z configfile: pytest.ini 2025-12-04T13:21:31.4323690Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4323766Z collecting ... collected 60 items / 9 deselected / 51 selected 2025-12-04T13:21:31.4323818Z stepcurrent: skipping 9 already run items. 2025-12-04T13:21:31.4323861Z Running 10 items in this shard 2025-12-04T13:21:31.4323884Z 2025-12-04T13:21:31.4324221Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 13:11:55.877000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 548419 2025-12-04T13:21:31.4324388Z I1204 13:11:55.878000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 548420 2025-12-04T13:21:31.4324539Z I1204 13:11:55.879000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 548421 2025-12-04T13:21:31.4324689Z I1204 13:11:55.879000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 548422 2025-12-04T13:21:31.4325274Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4325313Z _warn_cpu_init() 2025-12-04T13:21:31.4325807Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4325869Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4326445Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4326485Z _warn_cpu_init() 2025-12-04T13:21:31.4326974Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4327046Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4327615Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4327653Z _warn_cpu_init() 2025-12-04T13:21:31.4328142Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4328238Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4328824Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4328885Z _warn_cpu_init() 2025-12-04T13:21:31.4329177Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4329261Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4329752Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4329811Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4330096Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4330179Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4330466Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4330545Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4330832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4330910Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4331403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4331461Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4331771Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4331814Z return func(*args, **kwargs) 2025-12-04T13:21:31.4332101Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4332180Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4332669Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4332727Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4333022Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4333116Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4333401Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4333480Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4333970Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4334029Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4334316Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4334390Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4334619Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4334661Z return func(*args, **kwargs) 2025-12-04T13:21:31.4334886Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4334926Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4335188Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335408Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4335448Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4335709Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335940Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4335983Z return func(*args, **kwargs) 2025-12-04T13:21:31.4336201Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4336242Z return func(*args, **kwargs) 2025-12-04T13:21:31.4336459Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.4336500Z return func(*args, **kwargs) 2025-12-04T13:21:31.4336645Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4336810Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4337111Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4337287Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4337572Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4337699Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4337979Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4338128Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4338443Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4338590Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4338865Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4339003Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4339280Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4339430Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4339942Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4340072Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4340270Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4340664Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4340779Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4340992Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4341157Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4341195Z dist init r=1, world=4 2025-12-04T13:21:31.4341349Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4341531Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4341818Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4341971Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4342256Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4342381Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4342660Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4342809Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4343084Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4343232Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4343508Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4343646Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4343922Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4344070Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4344588Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4344704Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4344901Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4345289Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4345404Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4345628Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4345817Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4345856Z dist init r=3, world=4 2025-12-04T13:21:31.4345993Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4346153Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4346440Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4346595Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4346879Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4347004Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4347283Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4347429Z [rank2]:E1204 13:12:28.439000 548421 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4347707Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4347855Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4348130Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4348297Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4348587Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4348736Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4349243Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4349359Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4349554Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4349956Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4350093Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4350304Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4350468Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4350506Z dist init r=2, world=4 2025-12-04T13:21:31.4350643Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4350802Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4351090Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.4351246Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4351529Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4351652Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4351931Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4352079Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4352354Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4352500Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4352774Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4352920Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4353198Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4353348Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4353853Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4353968Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4354178Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4354588Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4354703Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4354913Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4355077Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4355116Z dist init r=0, world=4 2025-12-04T13:21:31.4355455Z [rank2]:[W1204 13:12:28.290645818 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4355786Z [rank1]:[W1204 13:12:28.306756168 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4356112Z [rank3]:[W1204 13:12:28.309660682 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4356440Z [rank0]:[W1204 13:12:28.377501880 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4356482Z FAILED [46.8445s] [ 10%] 2025-12-04T13:21:31.4356485Z 2025-12-04T13:21:31.4356542Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4356671Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T13:21:31.4356717Z Traceback (most recent call last): 2025-12-04T13:21:31.4356880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4356922Z self._join_processes(fn) 2025-12-04T13:21:31.4357105Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4357160Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4357340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4357384Z raise RuntimeError(error) 2025-12-04T13:21:31.4357465Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4357509Z Traceback (most recent call last): 2025-12-04T13:21:31.4357670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4357711Z getattr(self, test_name)() 2025-12-04T13:21:31.4357870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4357904Z fn() 2025-12-04T13:21:31.4358057Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4358107Z method(*args, **kwargs) 2025-12-04T13:21:31.4358311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4358364Z method(*args, **kwargs) 2025-12-04T13:21:31.4358514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4358551Z with policy(): 2025-12-04T13:21:31.4358704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4358743Z raise RuntimeError(msg) 2025-12-04T13:21:31.4359131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4359133Z 2025-12-04T13:21:31.4359211Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4359474Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4359477Z 2025-12-04T13:21:31.4359565Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4359568Z 2025-12-04T13:21:31.4359626Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4359672Z Traceback (most recent call last): 2025-12-04T13:21:31.4359834Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4359876Z getattr(self, test_name)() 2025-12-04T13:21:31.4360035Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4360070Z fn() 2025-12-04T13:21:31.4360221Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4360263Z method(*args, **kwargs) 2025-12-04T13:21:31.4360411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4360451Z method(*args, **kwargs) 2025-12-04T13:21:31.4360601Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4360637Z with policy(): 2025-12-04T13:21:31.4360789Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4360829Z raise RuntimeError(msg) 2025-12-04T13:21:31.4361228Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4361232Z 2025-12-04T13:21:31.4361305Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4361566Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4361568Z 2025-12-04T13:21:31.4361654Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4361656Z 2025-12-04T13:21:31.4361659Z 2025-12-04T13:21:31.4361735Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4361822Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.4362070Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-edddab3c46c1b17a.xml - 2025-12-04T13:21:31.4362160Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4362437Z FAILED [46.8445s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4362483Z Traceback (most recent call last): 2025-12-04T13:21:31.4362645Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4362688Z getattr(self, test_name)() 2025-12-04T13:21:31.4362848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4362882Z fn() 2025-12-04T13:21:31.4363033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4363076Z method(*args, **kwargs) 2025-12-04T13:21:31.4363226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4363266Z method(*args, **kwargs) 2025-12-04T13:21:31.4363415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4363452Z with policy(): 2025-12-04T13:21:31.4363603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4363644Z raise RuntimeError(msg) 2025-12-04T13:21:31.4364027Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4364032Z 2025-12-04T13:21:31.4364104Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4364365Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4364367Z 2025-12-04T13:21:31.4364453Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4364454Z 2025-12-04T13:21:31.4364513Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4364557Z Traceback (most recent call last): 2025-12-04T13:21:31.4364730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4364771Z getattr(self, test_name)() 2025-12-04T13:21:31.4364932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4364966Z fn() 2025-12-04T13:21:31.4365116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4365155Z method(*args, **kwargs) 2025-12-04T13:21:31.4365305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4365343Z method(*args, **kwargs) 2025-12-04T13:21:31.4365492Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4365529Z with policy(): 2025-12-04T13:21:31.4365681Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4365721Z raise RuntimeError(msg) 2025-12-04T13:21:31.4366114Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4366137Z 2025-12-04T13:21:31.4366211Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4366468Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4366470Z 2025-12-04T13:21:31.4366556Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4366620Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4366683Z ======================= 1 failed, 9 deselected in 46.99s ======================= 2025-12-04T13:21:31.4366720Z Got exit code 1 2025-12-04T13:21:31.4366761Z Retrying single test... 
2025-12-04T13:21:31.4366951Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-59232a91c35e498e.xml 2025-12-04T13:21:31.4367010Z ============================= test session starts ============================== 2025-12-04T13:21:31.4367122Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4367162Z cachedir: .pytest_cache 2025-12-04T13:21:31.4367321Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4367367Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4367408Z configfile: pytest.ini 2025-12-04T13:21:31.4367571Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4367646Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4367903Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4367948Z Running 1 items in this shard 2025-12-04T13:21:31.4367950Z 2025-12-04T13:21:31.4368315Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 13:12:45.125000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 549829 2025-12-04T13:21:31.4368471Z I1204 13:12:45.126000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 549830 2025-12-04T13:21:31.4368636Z I1204 13:12:45.126000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 549831 2025-12-04T13:21:31.4368788Z I1204 13:12:45.127000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 549832 2025-12-04T13:21:31.4369370Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4369406Z _warn_cpu_init() 2025-12-04T13:21:31.4369916Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4369994Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4370584Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4370621Z _warn_cpu_init() 2025-12-04T13:21:31.4371112Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4371175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4371744Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4371781Z _warn_cpu_init() 2025-12-04T13:21:31.4372271Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4372331Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4372902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4372937Z _warn_cpu_init() 2025-12-04T13:21:31.4373242Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4373328Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4373616Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4373695Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4373978Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4374060Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4374559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4374640Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4374926Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4375006Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4375500Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4375558Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4375848Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4375889Z return func(*args, **kwargs) 2025-12-04T13:21:31.4376376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4376433Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4376721Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4376800Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4377084Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4377160Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4377452Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4377532Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4378023Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4378083Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4378410Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4378484Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4378733Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4378787Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379025Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4379065Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4379327Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379548Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4379589Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379809Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4379850Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4380109Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380328Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4380367Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380586Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.4380626Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380772Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4380936Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4381226Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4381383Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4381680Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4381807Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4382085Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4382234Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4382512Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4382660Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4382944Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4383100Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4383378Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4383526Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4384038Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4384156Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4384351Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4384744Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4384860Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4385072Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4385237Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4385276Z dist init r=3, world=4 2025-12-04T13:21:31.4385415Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4385574Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4385870Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4386023Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4386310Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4386434Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4386711Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4386858Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4387143Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4387300Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4387583Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4387719Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4387996Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4388184Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4388694Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4388809Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4389004Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4389395Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4389512Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4389723Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4389887Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4389925Z dist init r=0, world=4 2025-12-04T13:21:31.4390062Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4390237Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4390524Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4390679Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4390962Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4391086Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4391362Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4391538Z [rank1]:E1204 13:13:17.966000 549830 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4391831Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4391977Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4392252Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4392387Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4392664Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4392812Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4393318Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4393433Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4393628Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4394022Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4394134Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4394344Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4394522Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4394561Z dist init r=1, world=4 2025-12-04T13:21:31.4394700Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4394859Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4395145Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.4395297Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4395581Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4395713Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4396012Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4396159Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4396436Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4396584Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4396859Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4396996Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4397272Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4397419Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4397928Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
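The RuntimeError above reports two numbers per rank: the caching-allocator bytes (512 before the test, 166400 after) and the driver-level allocation (for example 2300575744 growing to 17469276160 on device 2). For orientation only, here is a minimal sketch of that kind of before/after comparison using public torch.cuda APIs; it is not the harness's actual leak checker (that lives in torch/testing/_internal/common_utils.py, per the traceback) and it assumes a visible CUDA/ROCm device.

import torch

def snapshot(device: int) -> tuple[int, int]:
    # (caching-allocator bytes for live tensors, device-wide bytes in use per the driver)
    torch.cuda.synchronize(device)
    free, total = torch.cuda.mem_get_info(device)
    return torch.cuda.memory_allocated(device), total - free

def check_for_leak(fn, device: int = 0) -> None:
    alloc_before, driver_before = snapshot(device)
    fn()
    torch.cuda.empty_cache()  # release cached blocks so the driver figure reflects live memory
    alloc_after, driver_after = snapshot(device)
    if alloc_after > alloc_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went from "
            f"{alloc_before} to {alloc_after} bytes "
            f"(driver-level usage: {driver_before} -> {driver_after})"
        )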
2025-12-04T13:21:31.4398042Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4398288Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4398679Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4398809Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4399019Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4399184Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4399222Z dist init r=2, world=4 2025-12-04T13:21:31.4399557Z [rank3]:[W1204 13:13:18.835600518 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4399888Z [rank0]:[W1204 13:13:18.908379949 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4400236Z [rank1]:[W1204 13:13:18.913587296 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4400586Z [rank2]:[W1204 13:13:18.980650819 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4400626Z FAILED [47.0483s] [100%] 2025-12-04T13:21:31.4400628Z 2025-12-04T13:21:31.4400686Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4400815Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T13:21:31.4400861Z Traceback (most recent call last): 2025-12-04T13:21:31.4401026Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4401069Z self._join_processes(fn) 2025-12-04T13:21:31.4401243Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4401296Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4401474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4401518Z raise RuntimeError(error) 2025-12-04T13:21:31.4401599Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4401644Z Traceback (most recent call last): 2025-12-04T13:21:31.4401806Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4401847Z getattr(self, test_name)() 2025-12-04T13:21:31.4402006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4402040Z fn() 2025-12-04T13:21:31.4402192Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4402231Z method(*args, **kwargs) 2025-12-04T13:21:31.4402383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4402421Z method(*args, **kwargs) 2025-12-04T13:21:31.4402571Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4402606Z with policy(): 2025-12-04T13:21:31.4402769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4402809Z raise RuntimeError(msg) 2025-12-04T13:21:31.4403195Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4403199Z 2025-12-04T13:21:31.4403275Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4403537Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4403541Z 2025-12-04T13:21:31.4403629Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4403631Z 2025-12-04T13:21:31.4403633Z 2025-12-04T13:21:31.4403709Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4403807Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4404048Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-59232a91c35e498e.xml - 2025-12-04T13:21:31.4404129Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4404406Z FAILED [47.0483s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4404451Z Traceback (most recent call last): 2025-12-04T13:21:31.4404616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4404658Z getattr(self, test_name)() 2025-12-04T13:21:31.4404818Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4404853Z fn() 2025-12-04T13:21:31.4405006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4405046Z method(*args, **kwargs) 2025-12-04T13:21:31.4405200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4405238Z method(*args, **kwargs) 2025-12-04T13:21:31.4405388Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4405424Z with policy(): 2025-12-04T13:21:31.4405575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4405616Z raise RuntimeError(msg) 2025-12-04T13:21:31.4406000Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4406004Z 2025-12-04T13:21:31.4406078Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4406340Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4406342Z 2025-12-04T13:21:31.4406429Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4406491Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
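The repro command printed above can be run as-is from a pytorch checkout. A small wrapper that sets the same environment variables as this CI job is sketched below; the commented-out line mirrors the log's hint about silencing the repro banner, and check=True is only there so a local run fails loudly like the CI step does.

import os
import subprocess

env = dict(os.environ)
env["PYTORCH_TEST_WITH_ROCM"] = "1"            # flags copied from the repro line above
env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
# env["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "0"  # optional, per the suppression hint in the log

subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_core.py",
        "TestParityWithDDPCUDA."
        "test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda",
    ],
    env=env,
    check=True,  # raise if the test process exits non-zero, as it does here
)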
2025-12-04T13:21:31.4406567Z ====================== 1 failed, 18 deselected in 47.19s ======================= 2025-12-04T13:21:31.4406604Z Got exit code 1 2025-12-04T13:21:31.4406644Z Retrying single test... 2025-12-04T13:21:31.4406834Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d1df5502214bf3b9.xml 2025-12-04T13:21:31.4406893Z ============================= test session starts ============================== 2025-12-04T13:21:31.4407004Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4407045Z cachedir: .pytest_cache 2025-12-04T13:21:31.4407204Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4407250Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4407289Z configfile: pytest.ini 2025-12-04T13:21:31.4407453Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4407528Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4407791Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4407866Z Running 1 items in this shard 2025-12-04T13:21:31.4407868Z 2025-12-04T13:21:31.4408238Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 13:13:34.730000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 551239 2025-12-04T13:21:31.4408392Z I1204 13:13:34.731000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 551240 2025-12-04T13:21:31.4408543Z I1204 13:13:34.731000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 551241 2025-12-04T13:21:31.4408693Z I1204 13:13:34.732000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 551242 2025-12-04T13:21:31.4409273Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4409311Z _warn_cpu_init() 2025-12-04T13:21:31.4409808Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4409871Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4410440Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4410477Z _warn_cpu_init() 2025-12-04T13:21:31.4410981Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4411042Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4411610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4411647Z _warn_cpu_init() 2025-12-04T13:21:31.4412153Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4412224Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4412805Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4412842Z _warn_cpu_init() 2025-12-04T13:21:31.4413134Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4413218Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4413505Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4413586Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4414076Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4414134Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4414424Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4414504Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4414996Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4415053Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4415350Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4415429Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4415713Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4415790Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4416074Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4416146Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4416437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4416491Z return func(*args, **kwargs) 2025-12-04T13:21:31.4416994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4417063Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4417349Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4417429Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4417916Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4417975Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4418291Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4418364Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4418592Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4418635Z return func(*args, **kwargs) 2025-12-04T13:21:31.4418859Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4418902Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419162Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419386Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419426Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419661Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419703Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419922Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419962Z return func(*args, **kwargs) 2025-12-04T13:21:31.4420181Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4420219Z return func(*args, **kwargs) 2025-12-04T13:21:31.4420437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
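The UserWarnings repeated above (module initialized on CPU, `device_id` passed as plain "cuda" without an index, barrier() inferring the device) all point at the same remedy: pin each rank to an explicit device before constructing FSDP and pass an indexed device_id so FSDP moves the module itself. A minimal sketch, assuming one GPU per rank and an already-initialized process group; the module argument is a placeholder.

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(module: torch.nn.Module) -> FSDP:
    local_device = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_device)  # avoids the "FSDP will use the current device" guess
    return FSDP(
        module,
        device_id=torch.device("cuda", local_device),  # explicit index; FSDP moves the CPU module
    )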
2025-12-04T13:21:31.4420476Z return func(*args, **kwargs) 2025-12-04T13:21:31.4420621Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4420797Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4421110Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4421265Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4421550Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4421677Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4421955Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4422105Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4422381Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4422529Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4422804Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4422942Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4423220Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4423368Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4423892Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4424010Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4424207Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4424598Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4424712Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4424925Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4425099Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4425165Z dist init r=0, world=4 2025-12-04T13:21:31.4425302Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4425461Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4425746Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4425902Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4426188Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4426315Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4426593Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4426740Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4427017Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4427165Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4427441Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4427578Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4427854Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4428014Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4428560Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4428679Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4428875Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4429265Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4429407Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4429633Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4429797Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4429835Z dist init r=1, world=4 2025-12-04T13:21:31.4429972Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4430131Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4430419Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4430573Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4430857Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4430983Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4431260Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4431408Z [rank3]:E1204 13:14:07.414000 551242 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4431682Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4431829Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4432103Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4432253Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4432535Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4432684Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4433195Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4433309Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4433514Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4433914Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4434037Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4436618Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4436783Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4436823Z dist init r=3, world=4 2025-12-04T13:21:31.4436961Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4437121Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4437408Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.4437562Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4437874Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4438000Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4438308Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4438458Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4438735Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4438882Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4439178Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4439315Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4439591Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4439738Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4440261Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
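The "Started process N with pid ...", "dist init r=N, world=4", and "exiting process N with exit code: 10" lines come from a spawn-based multi-process harness: the parent launches one child per rank, and a non-zero child exit is re-raised as the RuntimeError shown in the FAILURES section. The general pattern looks roughly like the sketch below; the per-rank body and rendezvous address are placeholders, not the harness's actual code.

import torch.distributed as dist
import torch.multiprocessing as mp

WORLD_SIZE = 4

def _run_rank(rank: int) -> None:
    dist.init_process_group(
        "nccl", init_method="tcp://127.0.0.1:29500",
        rank=rank, world_size=WORLD_SIZE,
    )
    print(f"dist init r={rank}, world={WORLD_SIZE}")
    try:
        pass  # per-rank test body goes here
    finally:
        dist.destroy_process_group()

if __name__ == "__main__":
    # join=True makes the parent raise if any child exits with a non-zero code
    mp.spawn(_run_rank, nprocs=WORLD_SIZE, join=True)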
2025-12-04T13:21:31.4440378Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4440589Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4440976Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4441160Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4441372Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4441536Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4441575Z dist init r=2, world=4 2025-12-04T13:21:31.4441911Z [rank0]:[W1204 13:14:07.240668545 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442241Z [rank1]:[W1204 13:14:07.249826778 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442568Z [rank3]:[W1204 13:14:07.273658546 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442897Z [rank2]:[W1204 13:14:07.404744246 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442939Z FAILED [46.8455s] [100%] 2025-12-04T13:21:31.4442941Z 2025-12-04T13:21:31.4443001Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4443129Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T13:21:31.4443175Z Traceback (most recent call last): 2025-12-04T13:21:31.4443356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4443401Z self._join_processes(fn) 2025-12-04T13:21:31.4443575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4443630Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4443807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4443852Z raise RuntimeError(error) 2025-12-04T13:21:31.4443931Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4443977Z Traceback (most recent call last): 2025-12-04T13:21:31.4444137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4444181Z getattr(self, test_name)() 2025-12-04T13:21:31.4444338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4444385Z fn() 2025-12-04T13:21:31.4444536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4444588Z method(*args, **kwargs) 2025-12-04T13:21:31.4444738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4444778Z method(*args, **kwargs) 2025-12-04T13:21:31.4444928Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4444982Z with policy(): 2025-12-04T13:21:31.4445134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4445176Z raise RuntimeError(msg) 2025-12-04T13:21:31.4445562Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4445565Z 2025-12-04T13:21:31.4445642Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4445906Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4445909Z 2025-12-04T13:21:31.4445998Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4446000Z 2025-12-04T13:21:31.4446002Z 2025-12-04T13:21:31.4446079Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4446167Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4446402Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d1df5502214bf3b9.xml - 2025-12-04T13:21:31.4446465Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4446744Z FAILED [46.8455s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4446791Z Traceback (most recent call last): 2025-12-04T13:21:31.4446955Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4446998Z getattr(self, test_name)() 2025-12-04T13:21:31.4447166Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4447201Z fn() 2025-12-04T13:21:31.4447355Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4447396Z method(*args, **kwargs) 2025-12-04T13:21:31.4447547Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4447587Z method(*args, **kwargs) 2025-12-04T13:21:31.4447736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4447775Z with policy(): 2025-12-04T13:21:31.4447925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4447966Z raise RuntimeError(msg) 2025-12-04T13:21:31.4448398Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4448415Z 2025-12-04T13:21:31.4448491Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4448752Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4448754Z 2025-12-04T13:21:31.4448841Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4448919Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
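Both runs also end with per-rank ProcessGroupNCCL warnings that destroy_process_group() was not called before exit, plus the earlier barrier() warning about no device being bound at init. The shutdown pattern those messages point to is roughly the following; it assumes a torchrun-style launch that sets LOCAL_RANK, and device_id in init_process_group is only available in recent torch releases.

import os
import torch
import torch.distributed as dist

def main() -> None:
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)
    # Binding the group to an explicit device silences the barrier() device warning.
    dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))
    try:
        dist.barrier()
        # ... test / training body ...
    finally:
        dist.destroy_process_group()  # explicit teardown, as the NCCL warning recommends

if __name__ == "__main__":
    main()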
2025-12-04T13:21:31.4448983Z ====================== 1 failed, 18 deselected in 46.98s ======================= 2025-12-04T13:21:31.4449021Z Got exit code 1 2025-12-04T13:21:31.4449232Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4449362Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4449551Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0a49c8ca17fcd339.xml 2025-12-04T13:21:31.4449611Z ============================= test session starts ============================== 2025-12-04T13:21:31.4449722Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4449765Z cachedir: .pytest_cache 2025-12-04T13:21:31.4449923Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4449970Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4450011Z configfile: pytest.ini 2025-12-04T13:21:31.4450175Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4450249Z collecting ... collected 60 items / 10 deselected / 50 selected 2025-12-04T13:21:31.4450304Z stepcurrent: skipping 10 already run items. 2025-12-04T13:21:31.4450348Z Running 9 items in this shard 2025-12-04T13:21:31.4450352Z 2025-12-04T13:21:31.4450682Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 13:14:24.135000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 552649 2025-12-04T13:21:31.4450839Z I1204 13:14:24.136000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 552650 2025-12-04T13:21:31.4451002Z I1204 13:14:24.137000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 552651 2025-12-04T13:21:31.4451155Z I1204 13:14:24.137000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 552652 2025-12-04T13:21:31.4451738Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4451779Z _warn_cpu_init() 2025-12-04T13:21:31.4452075Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4452113Z _init_core_state( 2025-12-04T13:21:31.4452618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4452692Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4453265Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4453312Z _warn_cpu_init() 2025-12-04T13:21:31.4453607Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4453646Z _init_core_state( 2025-12-04T13:21:31.4454139Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4454202Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4454772Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4454810Z _warn_cpu_init() 2025-12-04T13:21:31.4455102Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4455140Z _init_core_state( 2025-12-04T13:21:31.4455640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4455699Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4456268Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4456306Z _warn_cpu_init() 2025-12-04T13:21:31.4456796Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4456863Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4457356Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4457432Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4457720Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4457764Z return func(*args, **kwargs) 2025-12-04T13:21:31.4458056Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4458095Z _init_core_state( 2025-12-04T13:21:31.4458611Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4458670Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4459159Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4459217Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4459447Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4459488Z return func(*args, **kwargs) 2025-12-04T13:21:31.4459712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4459753Z return func(*args, **kwargs) 2025-12-04T13:21:31.4459989Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.4460030Z return func(*args, **kwargs) 2025-12-04T13:21:31.4460252Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4460293Z return func(*args, **kwargs) 2025-12-04T13:21:31.4460513Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4460553Z return func(*args, **kwargs) 2025-12-04T13:21:31.4460773Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4460813Z return func(*args, **kwargs) 2025-12-04T13:21:31.4461032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4461083Z return func(*args, **kwargs) 2025-12-04T13:21:31.4461305Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4461361Z return func(*args, **kwargs) 2025-12-04T13:21:31.4461506Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4461669Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4461976Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4462134Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4462421Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4462547Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4462826Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4462974Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4463253Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4463401Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4463681Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4463818Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4464106Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4464256Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4464763Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4464881Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4465078Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4465473Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4465599Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4465809Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4465975Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4466026Z dist init r=0, world=4 2025-12-04T13:21:31.4466165Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4466326Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4466616Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4466771Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4467054Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4467180Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4467457Z [rank3]:E1204 13:14:57.051000 552652 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4467606Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4467881Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4468029Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4468344Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4468492Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4468777Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4468926Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4469429Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4469544Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4469751Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4470148Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4470262Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4470487Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4470653Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4470691Z dist init r=3, world=4 2025-12-04T13:21:31.4470830Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4470993Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4471283Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4471438Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4471723Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4471847Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4472124Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4472272Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4472547Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4472706Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4472982Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4473120Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4473398Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4473548Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4474061Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4474194Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4474388Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4474771Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4474897Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4475108Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4475273Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4475311Z dist init r=2, world=4 2025-12-04T13:21:31.4475449Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4475607Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4475896Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4476049Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4476335Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4476461Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4476736Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4476896Z [rank1]:E1204 13:14:57.112000 552650 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4477172Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4477321Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4477596Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4477733Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4478013Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4478219Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4478721Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4478859Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4479054Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4479437Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4479551Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4479761Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4479924Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4479964Z dist init r=1, world=4 2025-12-04T13:21:31.4480300Z [rank3]:[W1204 13:14:57.892706910 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4480632Z [rank0]:[W1204 13:14:57.893304251 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4480959Z [rank2]:[W1204 13:14:57.935290170 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4481299Z [rank1]:[W1204 13:14:57.059136509 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4481341Z FAILED [47.1405s] [ 11%] 2025-12-04T13:21:31.4481343Z 2025-12-04T13:21:31.4481401Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4481525Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T13:21:31.4481572Z Traceback (most recent call last): 2025-12-04T13:21:31.4481736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4481779Z self._join_processes(fn) 2025-12-04T13:21:31.4481952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4482007Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4482185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4482229Z raise RuntimeError(error) 2025-12-04T13:21:31.4482310Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4482365Z Traceback (most recent call last): 2025-12-04T13:21:31.4482527Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4482579Z getattr(self, test_name)() 2025-12-04T13:21:31.4482738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4482772Z fn() 2025-12-04T13:21:31.4482923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4482977Z method(*args, **kwargs) 2025-12-04T13:21:31.4483128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4483169Z method(*args, **kwargs) 2025-12-04T13:21:31.4483319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4483357Z with policy(): 2025-12-04T13:21:31.4483509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4483550Z raise RuntimeError(msg) 2025-12-04T13:21:31.4483930Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4483933Z 2025-12-04T13:21:31.4484008Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4484264Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4484267Z 2025-12-04T13:21:31.4484358Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4484361Z 2025-12-04T13:21:31.4484363Z 2025-12-04T13:21:31.4484439Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4484527Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4484763Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0a49c8ca17fcd339.xml - 2025-12-04T13:21:31.4484824Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4485108Z FAILED [47.1405s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4485154Z Traceback (most recent call last): 2025-12-04T13:21:31.4485319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4485361Z getattr(self, test_name)() 2025-12-04T13:21:31.4485521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4485556Z fn() 2025-12-04T13:21:31.4485707Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4485748Z method(*args, **kwargs) 2025-12-04T13:21:31.4485900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4485939Z method(*args, **kwargs) 2025-12-04T13:21:31.4486090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4486126Z with policy(): 2025-12-04T13:21:31.4486288Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4486338Z raise RuntimeError(msg) 2025-12-04T13:21:31.4486716Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4486730Z 2025-12-04T13:21:31.4486805Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4487062Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4487064Z 2025-12-04T13:21:31.4487152Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4487215Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
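The repeated _init_utils.py warnings in the sessions above ask for an explicit device index instead of the bare `cuda` device. A minimal sketch of the pattern the warning describes, using placeholder names `model` and `rank` and assuming the process group is already initialized (this is not the test's own code):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_explicit_device(model, rank):
        # Make the current device explicit, as the warning suggests...
        torch.cuda.set_device(rank)
        # ...and pass an indexed device_id so FSDP does not have to infer it.
        return FSDP(model, device_id=torch.device(f"cuda:{rank}"))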
2025-12-04T13:21:31.4487282Z ====================== 1 failed, 10 deselected in 47.28s ======================= 2025-12-04T13:21:31.4487319Z Got exit code 1 2025-12-04T13:21:31.4487360Z Retrying single test... 2025-12-04T13:21:31.4487549Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-52a6d20b2aed2cc3.xml 2025-12-04T13:21:31.4487608Z ============================= test session starts ============================== 2025-12-04T13:21:31.4487721Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4487762Z cachedir: .pytest_cache 2025-12-04T13:21:31.4487920Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4487966Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4488006Z configfile: pytest.ini 2025-12-04T13:21:31.4488215Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4488291Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4488540Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4488585Z Running 1 items in this shard 2025-12-04T13:21:31.4488588Z 2025-12-04T13:21:31.4488931Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 13:15:13.982000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 554059 2025-12-04T13:21:31.4489087Z I1204 13:15:13.983000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 554060 2025-12-04T13:21:31.4489241Z I1204 13:15:13.984000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 554061 2025-12-04T13:21:31.4489393Z I1204 13:15:13.984000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 554062 2025-12-04T13:21:31.4489978Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4490017Z _warn_cpu_init() 2025-12-04T13:21:31.4490334Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4490383Z _init_core_state( 2025-12-04T13:21:31.4490876Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4490950Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4491522Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4491560Z _warn_cpu_init() 2025-12-04T13:21:31.4491853Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4491890Z _init_core_state( 2025-12-04T13:21:31.4492385Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4492447Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4493013Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4493051Z _warn_cpu_init() 2025-12-04T13:21:31.4493345Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4493381Z _init_core_state( 2025-12-04T13:21:31.4493881Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4493940Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4494510Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4494548Z _warn_cpu_init() 2025-12-04T13:21:31.4495047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4495116Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4495600Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4495672Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4496159Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4496216Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4496511Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4496549Z _init_core_state( 2025-12-04T13:21:31.4497040Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4497099Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4497390Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4497432Z return func(*args, **kwargs) 2025-12-04T13:21:31.4497661Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4497704Z return func(*args, **kwargs) 2025-12-04T13:21:31.4497936Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4497978Z return func(*args, **kwargs) 2025-12-04T13:21:31.4498241Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.4498283Z return func(*args, **kwargs) 2025-12-04T13:21:31.4498504Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4498544Z return func(*args, **kwargs) 2025-12-04T13:21:31.4498764Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4498805Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499026Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4499079Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499299Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4499354Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499574Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4499630Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499776Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4499939Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4500230Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4500385Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4500671Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4500797Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4501076Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4501226Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4501502Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4501651Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4501927Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4502078Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4502355Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4502506Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4503012Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4503130Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4503339Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4503722Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4503850Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4504071Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4504238Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4504276Z dist init r=1, world=4 2025-12-04T13:21:31.4504417Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4504577Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4504866Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4505020Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4505306Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4505432Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4505708Z [rank3]:E1204 13:15:46.982000 554062 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4505858Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4506134Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4506281Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4506574Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4506711Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4506994Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4507144Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4507655Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4507781Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4507976Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4508384Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4508522Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4508735Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4508899Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4508939Z dist init r=3, world=4 2025-12-04T13:21:31.4509076Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4509238Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4509527Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4509680Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4509965Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4510089Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4510364Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4510511Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4510801Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4510949Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4511224Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4511360Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4511639Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4511789Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4512304Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4512432Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4512639Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4513023Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4513136Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4513348Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4513512Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4513550Z dist init r=2, world=4 2025-12-04T13:21:31.4513688Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4513850Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4514139Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4514294Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4514576Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4514702Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4514987Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4515136Z [rank0]:E1204 13:15:47.020000 554059 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4515412Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4515559Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4515834Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4515972Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4516262Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4516419Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4516920Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4517045Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4517241Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4517623Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4517735Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4517947Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4518110Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4518186Z dist init r=0, world=4 2025-12-04T13:21:31.4518524Z [rank3]:[W1204 13:15:47.865831727 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4518856Z [rank1]:[W1204 13:15:47.870142828 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4519184Z [rank2]:[W1204 13:15:47.966213464 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4519525Z [rank0]:[W1204 13:15:47.967603962 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4519567Z FAILED [47.4467s] [100%] 2025-12-04T13:21:31.4519571Z 2025-12-04T13:21:31.4519627Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4519752Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T13:21:31.4519798Z Traceback (most recent call last): 2025-12-04T13:21:31.4519963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4520005Z self._join_processes(fn) 2025-12-04T13:21:31.4520180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4520234Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4520422Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4520479Z raise RuntimeError(error) 2025-12-04T13:21:31.4520559Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4520605Z Traceback (most recent call last): 2025-12-04T13:21:31.4520765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4520829Z getattr(self, test_name)() 2025-12-04T13:21:31.4520989Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4521024Z fn() 2025-12-04T13:21:31.4521177Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4521217Z method(*args, **kwargs) 2025-12-04T13:21:31.4521369Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4521409Z method(*args, **kwargs) 2025-12-04T13:21:31.4521560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4521598Z with policy(): 2025-12-04T13:21:31.4521749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4521791Z raise RuntimeError(msg) 2025-12-04T13:21:31.4522171Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
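The ProcessGroupNCCL warnings above are about missing teardown: none of the four ranks calls destroy_process_group() before exiting. A minimal sketch of the recommended cleanup, assuming the usual env:// rendezvous these tests rely on; the worker body here is a placeholder, not the actual FSDP test:

import os
import torch
import torch.distributed as dist

def main() -> None:
    # Illustrative per-rank setup; the test harness normally does this itself.
    rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(rank)
    dist.init_process_group(backend="nccl", init_method="env://")
    try:
        dist.barrier()  # placeholder for the real per-rank test body
    finally:
        # Explicit teardown releases NCCL resources and avoids the
        # "destroy_process_group() was not called before program exit" warning.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()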
2025-12-04T13:21:31.4522175Z 2025-12-04T13:21:31.4522249Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4522507Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4522511Z 2025-12-04T13:21:31.4522599Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4522601Z 2025-12-04T13:21:31.4522602Z 2025-12-04T13:21:31.4522679Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4522768Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4523004Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-52a6d20b2aed2cc3.xml - 2025-12-04T13:21:31.4523074Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4523347Z FAILED [47.4467s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4523395Z Traceback (most recent call last): 2025-12-04T13:21:31.4523561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4523602Z getattr(self, test_name)() 2025-12-04T13:21:31.4523763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4523799Z fn() 2025-12-04T13:21:31.4523950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4523991Z method(*args, **kwargs) 2025-12-04T13:21:31.4524141Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4524192Z method(*args, **kwargs) 2025-12-04T13:21:31.4524342Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4524390Z with policy(): 2025-12-04T13:21:31.4524542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4524582Z raise RuntimeError(msg) 2025-12-04T13:21:31.4524960Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4524973Z 2025-12-04T13:21:31.4525050Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4525305Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4525308Z 2025-12-04T13:21:31.4525396Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4525459Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
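The repro command above enables the same leak check that failed here (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1): the harness records the caching-allocator and driver memory counters before the test and re-checks them afterwards. A rough, hand-rolled version of that comparison for local debugging, assuming one visible GPU; run_suspect_workload is a stand-in for the failing test body, not a PyTorch API:

import torch

def run_suspect_workload() -> None:
    # Stand-in for the failing FSDP test body.
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x
    del x, y

def check_for_leak() -> None:
    torch.cuda.synchronize()
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated()    # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info()  # driver-side free/total bytes
    run_suspect_workload()
    torch.cuda.synchronize()
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated()
    free_after, _ = torch.cuda.mem_get_info()
    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible leak: allocator {alloc_before} -> {alloc_after} bytes, "
            f"driver-used {total - free_before} -> {total - free_after} bytes"
        )

if __name__ == "__main__":
    check_for_leak()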
2025-12-04T13:21:31.4525522Z ====================== 1 failed, 18 deselected in 47.61s ======================= 2025-12-04T13:21:31.4525559Z Got exit code 1 2025-12-04T13:21:31.4525600Z Retrying single test... 2025-12-04T13:21:31.4525791Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aef99b217eb6286.xml 2025-12-04T13:21:31.4525850Z ============================= test session starts ============================== 2025-12-04T13:21:31.4525963Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4526005Z cachedir: .pytest_cache 2025-12-04T13:21:31.4526163Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4526209Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4526249Z configfile: pytest.ini 2025-12-04T13:21:31.4526412Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4526487Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4526738Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4526782Z Running 1 items in this shard 2025-12-04T13:21:31.4526796Z 2025-12-04T13:21:31.4527126Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 13:16:04.049000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 555469 2025-12-04T13:21:31.4527281Z I1204 13:16:04.050000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 555470 2025-12-04T13:21:31.4527433Z I1204 13:16:04.051000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 555471 2025-12-04T13:21:31.4527583Z I1204 13:16:04.052000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 555472 2025-12-04T13:21:31.4528226Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4528276Z _warn_cpu_init() 2025-12-04T13:21:31.4528574Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4528611Z _init_core_state( 2025-12-04T13:21:31.4529102Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4529178Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4529749Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4529788Z _warn_cpu_init() 2025-12-04T13:21:31.4530081Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4530120Z _init_core_state( 2025-12-04T13:21:31.4530614Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4530676Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4531245Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4531282Z _warn_cpu_init() 2025-12-04T13:21:31.4531588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4531625Z _init_core_state( 2025-12-04T13:21:31.4532113Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4532174Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4532752Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4532799Z _warn_cpu_init() 2025-12-04T13:21:31.4533287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4533355Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4533642Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4533685Z return func(*args, **kwargs) 2025-12-04T13:21:31.4534172Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4534230Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4534526Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4534563Z _init_core_state( 2025-12-04T13:21:31.4535050Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4535109Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4535594Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4535652Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4535892Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4535936Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536160Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4536202Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536425Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
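The UserWarnings in this rerun all point at device placement: the model is constructed on CPU, `device_id` is passed as the bare "cuda" string without an index, no per-rank torch.cuda.set_device() call is made, and init_process_group is not told which device backs barrier(). A sketch of the initialization those warnings ask for, assuming one GPU per rank; TinyModel is a placeholder rather than the test's mixture-of-experts model:

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

class TinyModel(nn.Module):  # placeholder for the test's model
    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)

def init_rank(rank: int) -> FSDP:
    device = torch.device("cuda", rank)
    # Bind this rank to its GPU before any collective or FSDP call...
    torch.cuda.set_device(device)
    # ...and tell the process group which device backs barrier()/NCCL ops.
    dist.init_process_group(backend="nccl", init_method="env://", device_id=device)
    # An indexed device_id lets FSDP run sharding initialization on the GPU
    # (instead of CPU) and is needed for sync_module_states=True, which
    # requires GPU communication.
    return FSDP(TinyModel(), device_id=device, sync_module_states=True)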
2025-12-04T13:21:31.4536467Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4536727Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536947Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4536997Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537216Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4537273Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537493Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4537543Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4537802Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537947Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4538112Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4538438Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4538593Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4538881Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4539009Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4539287Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4539436Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4539713Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4539863Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4540159Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File 

"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4540297Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4540577Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4540724Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4541232Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4541361Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4541572Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4541956Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4542089Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4542301Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4542467Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4542507Z dist init r=1, world=4 2025-12-04T13:21:31.4542645Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4542805Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4543095Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4543250Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4543534Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4543660Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4543936Z [rank2]:E1204 13:16:37.004000 555471 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4544084Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4544370Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4544517Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4544793Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4544928Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4545212Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4545363Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4545878Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
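For scale, the byte counts in the leak report just above convert to roughly 78 KiB of retained caching-allocator memory and about 14 GiB of additional driver-side memory on device 2. A quick sanity computation with the numbers copied from the log (not part of the test output):

# Values reported for device 2 in the RuntimeError above.
alloc_before, alloc_after = 512, 80_384
driver_before, driver_after = 2_300_575_744, 17_469_276_160

print((alloc_after - alloc_before) / 1024)       # 78.0 KiB held by the caching allocator
print((driver_after - driver_before) / 1024**3)  # ~14.13 GiB more driver-allocated memory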
2025-12-04T13:21:31.4546006Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4546201Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4546597Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4546713Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4546924Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4547089Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4547129Z dist init r=2, world=4 2025-12-04T13:21:31.4547267Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4547427Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4547715Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4547869Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4548189Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4548314Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4548604Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4548754Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4549028Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4549175Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4549449Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4549588Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4549878Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4550040Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4550544Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4550671Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4550868Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4551249Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4551364Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4551575Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4551740Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4551780Z dist init r=3, world=4 2025-12-04T13:21:31.4551918Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4552078Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4552368Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4552522Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4552820Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4552946Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4553222Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4553370Z [rank0]:E1204 13:16:37.015000 555469 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4553645Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4553791Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4554088Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4554224Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4554516Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4554665Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4555178Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4555292Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4555487Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4555870Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4555984Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4556196Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4556360Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4556400Z dist init r=0, world=4 2025-12-04T13:21:31.4556738Z [rank1]:[W1204 13:16:37.802322408 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557071Z [rank2]:[W1204 13:16:37.936681677 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557410Z [rank3]:[W1204 13:16:37.945668464 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557738Z [rank0]:[W1204 13:16:37.959967906 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557779Z FAILED [47.1474s] [100%] 2025-12-04T13:21:31.4557781Z 2025-12-04T13:21:31.4557840Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4557963Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T13:21:31.4558009Z Traceback (most recent call last): 2025-12-04T13:21:31.4558206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4558250Z self._join_processes(fn) 2025-12-04T13:21:31.4558441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4558507Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4558685Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4560900Z raise RuntimeError(error) 2025-12-04T13:21:31.4560988Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4561064Z Traceback (most recent call last): 2025-12-04T13:21:31.4561236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4561280Z getattr(self, test_name)() 2025-12-04T13:21:31.4561442Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4561479Z fn() 2025-12-04T13:21:31.4561631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4561675Z method(*args, **kwargs) 2025-12-04T13:21:31.4561826Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4561868Z method(*args, **kwargs) 2025-12-04T13:21:31.4562018Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4562058Z with policy(): 2025-12-04T13:21:31.4562210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4562251Z raise RuntimeError(msg) 2025-12-04T13:21:31.4562634Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4562639Z 2025-12-04T13:21:31.4562715Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4562973Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4562976Z 2025-12-04T13:21:31.4563066Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4563068Z 2025-12-04T13:21:31.4563070Z 2025-12-04T13:21:31.4563150Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4563272Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4563510Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aef99b217eb6286.xml - 2025-12-04T13:21:31.4563572Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4563844Z FAILED [47.1474s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4563891Z Traceback (most recent call last): 2025-12-04T13:21:31.4564055Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4564098Z getattr(self, test_name)() 2025-12-04T13:21:31.4564259Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4564294Z fn() 2025-12-04T13:21:31.4564456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4564497Z method(*args, **kwargs) 2025-12-04T13:21:31.4564660Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4564700Z method(*args, **kwargs) 2025-12-04T13:21:31.4564849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4564886Z with policy(): 2025-12-04T13:21:31.4565049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4565089Z raise RuntimeError(msg) 2025-12-04T13:21:31.4565471Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4565473Z 2025-12-04T13:21:31.4565550Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4565807Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4565810Z 2025-12-04T13:21:31.4565896Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4565961Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
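Each failing rank exits with code 10, and the parent test process (via _join_processes / _check_return_codes in common_distributed.py) converts any nonzero child exit code into the RuntimeError shown above before pytest reruns the test. A stripped-down sketch of that parent/child pattern using plain multiprocessing; the worker body and the reuse of exit code 10 are illustrative:

import multiprocessing as mp
import sys

LEAK_EXIT_CODE = 10  # matches the per-rank exit code reported in this log

def worker(rank: int) -> None:
    # Stand-in for run_test(); a detected failure exits with a dedicated code.
    failed = True
    if failed:
        sys.exit(LEAK_EXIT_CODE)

def main() -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(r,)) for r in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    main()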
2025-12-04T13:21:31.4566024Z ====================== 1 failed, 18 deselected in 47.29s ======================= 2025-12-04T13:21:31.4566061Z Got exit code 1 2025-12-04T13:21:31.4566266Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4566396Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4566586Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f32c471be9ed6c85.xml 2025-12-04T13:21:31.4566644Z ============================= test session starts ============================== 2025-12-04T13:21:31.4566758Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4566801Z cachedir: .pytest_cache 2025-12-04T13:21:31.4566959Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4567007Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4567046Z configfile: pytest.ini 2025-12-04T13:21:31.4567221Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4567298Z collecting ... collected 60 items / 11 deselected / 49 selected 2025-12-04T13:21:31.4567353Z stepcurrent: skipping 11 already run items. 2025-12-04T13:21:31.4567396Z Running 8 items in this shard 2025-12-04T13:21:31.4567400Z 2025-12-04T13:21:31.4567726Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:16:53.929000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 556879 2025-12-04T13:21:31.4567883Z I1204 13:16:53.930000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 556880 2025-12-04T13:21:31.4568034Z I1204 13:16:53.930000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 556881 2025-12-04T13:21:31.4568223Z I1204 13:16:53.931000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 556882 2025-12-04T13:21:31.4568819Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4568871Z _warn_cpu_init() 2025-12-04T13:21:31.4569454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4569493Z _warn_cpu_init() 2025-12-04T13:21:31.4570057Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4570095Z _warn_cpu_init() 2025-12-04T13:21:31.4570663Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4570700Z _warn_cpu_init() 2025-12-04T13:21:31.4570989Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4571033Z return func(*args, **kwargs) 2025-12-04T13:21:31.4571175Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4571340Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4571647Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4571803Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4572088Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4572214Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4572494Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4572643Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4572930Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4573089Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4573363Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4573509Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4573787Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4573937Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4574431Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4574548Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4574745Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4575125Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4575240Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4575453Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4575619Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4575657Z dist init r=1, world=4 2025-12-04T13:21:31.4575807Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4575967Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4576255Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4576409Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4576695Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4576819Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4577105Z [rank3]:E1204 13:17:01.854000 556882 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4577265Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4577540Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4577701Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4577976Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4578115Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4578453Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4578602Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4579098Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T13:21:31.4579213Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4579409Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4579787Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4579903Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4580129Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4580294Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4580333Z dist init r=3, world=4 2025-12-04T13:21:31.4580471Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4580631Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4580917Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4581073Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4581372Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4581509Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4581786Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4581935Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4582226Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4582373Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4582648Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4582784Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4583062Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4583210Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4583704Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:21:31.4583819Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4584013Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4584408Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4584524Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4584735Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4584901Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4584939Z dist init r=0, world=4 2025-12-04T13:21:31.4585077Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4585238Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4585526Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4585692Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4585986Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4586110Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4586402Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4586552Z [rank2]:E1204 13:17:01.907000 556881 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4586830Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4586979Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4587253Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4587390Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4587668Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4587817Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4588354Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T13:21:31.4588469Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4588679Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4589059Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4589174Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4589386Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4589552Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4589590Z dist init r=2, world=4 2025-12-04T13:21:31.4589940Z [rank0]:[W1204 13:17:02.814163560 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4589981Z FAILED [9.8157s] [ 12%] 2025-12-04T13:21:31.4589997Z 2025-12-04T13:21:31.4590053Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4590167Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.4590213Z Traceback (most recent call last): 2025-12-04T13:21:31.4590377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4590436Z self._join_processes(fn) 2025-12-04T13:21:31.4590609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4590662Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4590842Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4590886Z raise RuntimeError(error) 2025-12-04T13:21:31.4590968Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4591014Z Traceback (most recent call last): 2025-12-04T13:21:31.4591176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4591218Z getattr(self, test_name)() 2025-12-04T13:21:31.4591379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4591414Z fn() 2025-12-04T13:21:31.4591567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4591608Z method(*args, **kwargs) 2025-12-04T13:21:31.4591761Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4591802Z method(*args, **kwargs) 2025-12-04T13:21:31.4591952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4591989Z with policy(): 2025-12-04T13:21:31.4592141Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4592182Z raise RuntimeError(msg) 2025-12-04T13:21:31.4592550Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
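The failure above comes from a before/after memory comparison: the harness records the caching-allocator usage when the test starts and again when it ends, and raises if the numbers moved. A minimal sketch of that idea using only public torch.cuda counters follows; it is a simplified illustration, not the CudaMemoryLeakCheck code in common_utils.py, and the device index and zero-byte tolerance are illustrative choices.

    import torch

    def run_with_leak_check(fn, device=0, tolerance_bytes=0):
        # Snapshot caching-allocator usage on one device before the test body.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)
        fn()
        # Release cached blocks the allocator may still hold, then re-measure.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        after = torch.cuda.memory_allocated(device)
        if after - before > tolerance_bytes:
            raise RuntimeError(
                f"possible leak on device {device}: {before} -> {after} bytes"
            )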
2025-12-04T13:21:31.4592554Z 2025-12-04T13:21:31.4592640Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4592891Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4592894Z 2025-12-04T13:21:31.4592982Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4592985Z 2025-12-04T13:21:31.4592987Z 2025-12-04T13:21:31.4593063Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4593151Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4593387Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f32c471be9ed6c85.xml - 2025-12-04T13:21:31.4593448Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4593726Z FAILED [9.8157s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4593773Z Traceback (most recent call last): 2025-12-04T13:21:31.4593950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4593993Z getattr(self, test_name)() 2025-12-04T13:21:31.4594153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4594199Z fn() 2025-12-04T13:21:31.4594351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4594389Z method(*args, **kwargs) 2025-12-04T13:21:31.4594541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4594580Z method(*args, **kwargs) 2025-12-04T13:21:31.4594731Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4594768Z with policy(): 2025-12-04T13:21:31.4594921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4594961Z raise RuntimeError(msg) 2025-12-04T13:21:31.4595327Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4595330Z 2025-12-04T13:21:31.4595405Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4595655Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4595657Z 2025-12-04T13:21:31.4595745Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4595809Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.4595871Z ======================= 1 failed, 11 deselected in 9.98s ======================= 2025-12-04T13:21:31.4595908Z Got exit code 1 2025-12-04T13:21:31.4595949Z Retrying single test... 2025-12-04T13:21:31.4596138Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f90149723f3b30c.xml 2025-12-04T13:21:31.4596197Z ============================= test session starts ============================== 2025-12-04T13:21:31.4596321Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4596364Z cachedir: .pytest_cache 2025-12-04T13:21:31.4596522Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4596568Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4596609Z configfile: pytest.ini 2025-12-04T13:21:31.4596773Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4596848Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4597090Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4597137Z Running 1 items in this shard 2025-12-04T13:21:31.4597139Z 2025-12-04T13:21:31.4597461Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:17:06.739000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 557281 2025-12-04T13:21:31.4597626Z I1204 13:17:06.740000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 557282 2025-12-04T13:21:31.4597788Z I1204 13:17:06.741000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 557283 2025-12-04T13:21:31.4597938Z I1204 13:17:06.741000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 557284 2025-12-04T13:21:31.4598568Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4598623Z _warn_cpu_init() 2025-12-04T13:21:31.4599194Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
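The `_warn_cpu_init()` UserWarning repeated in this log recommends passing `device_id` when wrapping a CPU-resident module so that sharding initialization (and `sync_module_states=True`) run on the GPU. A minimal sketch of that recommendation is below; it assumes a process group is already initialized and uses a plain nn.Linear as a stand-in module.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # A module constructed on CPU; giving FSDP a device_id lets it move the
    # module to the local GPU itself, so sharding init does not run on CPU.
    module = nn.Linear(1024, 1024)
    fsdp_module = FSDP(
        module,
        device_id=torch.cuda.current_device(),
        sync_module_states=True,  # needs the module on GPU, per the warning
    )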
2025-12-04T13:21:31.4599233Z _warn_cpu_init() 2025-12-04T13:21:31.4599800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4599837Z _warn_cpu_init() 2025-12-04T13:21:31.4600403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4600440Z _warn_cpu_init() 2025-12-04T13:21:31.4600746Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4600789Z return func(*args, **kwargs) 2025-12-04T13:21:31.4600934Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4601098Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4601387Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4601544Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4601830Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4601967Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4602247Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4602416Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4602693Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4602852Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4603132Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4603270Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4603548Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4603697Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4604191Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4604309Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4604505Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4604881Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4605008Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4605222Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4605388Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4605428Z dist init r=1, world=4 2025-12-04T13:21:31.4605566Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4605724Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4606013Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4606166Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4606463Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4606600Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4606876Z [rank2]:E1204 13:17:14.733000 557283 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4607035Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4607313Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4607461Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4607739Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4607875Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4608206Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4608355Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4608847Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 78336 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
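The c10d_logger barrier() warning seen earlier ("using the device under current context") can be silenced the way that warning suggests: bind the process group to a specific device at init time. A minimal sketch follows; it assumes RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT are supplied by the launcher, and the `device_id` keyword is available in recent PyTorch releases.

    import os
    import torch
    import torch.distributed as dist

    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    local_device = torch.device("cuda", rank % torch.cuda.device_count())

    # Passing device_id ties the default process group to one GPU, so
    # collectives such as barrier() no longer guess the device from context.
    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=local_device,
    )
    dist.barrier()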
2025-12-04T13:21:31.4608963Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4609160Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4609547Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4609662Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4609875Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4610039Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4610079Z dist init r=2, world=4 2025-12-04T13:21:31.4610216Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4610376Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4610677Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4610845Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4611131Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4611269Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4611546Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4611694Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4611971Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4612118Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4612398Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4612534Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4612812Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4612962Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4613454Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4613579Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4613775Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4614150Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4614265Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4614478Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4614646Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4614684Z dist init r=3, world=4 2025-12-04T13:21:31.4614833Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4615002Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4615291Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4615455Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4615742Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4615865Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4616143Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4616293Z [rank0]:E1204 13:17:14.782000 557281 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4616568Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4616717Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4616995Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4617132Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4617409Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4617560Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4618062Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 76288 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:21:31.4618227Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4618424Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4618796Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4618912Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4619146Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4619310Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4619363Z dist init r=0, world=4 2025-12-04T13:21:31.4619699Z [rank0]:[W1204 13:17:15.721097705 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4619755Z FAILED [10.1155s] [100%] 2025-12-04T13:21:31.4619758Z 2025-12-04T13:21:31.4619815Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4619929Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.4619976Z Traceback (most recent call last): 2025-12-04T13:21:31.4620139Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4620184Z self._join_processes(fn) 2025-12-04T13:21:31.4620356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4620411Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4620588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4620634Z raise RuntimeError(error) 2025-12-04T13:21:31.4620715Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4620760Z Traceback (most recent call last): 2025-12-04T13:21:31.4620921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4620963Z getattr(self, test_name)() 2025-12-04T13:21:31.4621122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4621156Z fn() 2025-12-04T13:21:31.4621308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4621348Z method(*args, **kwargs) 2025-12-04T13:21:31.4621498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4621540Z method(*args, **kwargs) 2025-12-04T13:21:31.4621690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4621727Z with policy(): 2025-12-04T13:21:31.4621891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4621932Z raise RuntimeError(msg) 2025-12-04T13:21:31.4622304Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
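The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit") points at missing teardown rather than at the test body itself. A minimal sketch of the shutdown pattern it asks for, with a placeholder main():

    import torch.distributed as dist

    def main():
        # ... per-rank work that uses the default process group ...
        pass

    if __name__ == "__main__":
        try:
            main()
        finally:
            if dist.is_initialized():
                # Explicit teardown avoids the NCCL resource-leak warning at exit.
                dist.destroy_process_group()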
2025-12-04T13:21:31.4622308Z 2025-12-04T13:21:31.4622384Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4622630Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4622634Z 2025-12-04T13:21:31.4622721Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4622725Z 2025-12-04T13:21:31.4622785Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4622830Z Traceback (most recent call last): 2025-12-04T13:21:31.4623004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4623676Z getattr(self, test_name)() 2025-12-04T13:21:31.4623835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4623869Z fn() 2025-12-04T13:21:31.4624020Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4624069Z method(*args, **kwargs) 2025-12-04T13:21:31.4624220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4624258Z method(*args, **kwargs) 2025-12-04T13:21:31.4624408Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4624445Z with policy(): 2025-12-04T13:21:31.4624598Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4624640Z raise RuntimeError(msg) 2025-12-04T13:21:31.4625003Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 78336 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T13:21:31.4625005Z 2025-12-04T13:21:31.4625080Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4625327Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4625331Z 2025-12-04T13:21:31.4625418Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4625420Z 2025-12-04T13:21:31.4625479Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4625524Z Traceback (most recent call last): 2025-12-04T13:21:31.4625686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4625728Z getattr(self, test_name)() 2025-12-04T13:21:31.4625886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4625921Z fn() 2025-12-04T13:21:31.4626070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4626109Z method(*args, **kwargs) 2025-12-04T13:21:31.4626268Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4626307Z method(*args, **kwargs) 2025-12-04T13:21:31.4626458Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4626495Z with policy(): 2025-12-04T13:21:31.4626646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4626686Z raise RuntimeError(msg) 2025-12-04T13:21:31.4627050Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4627053Z 2025-12-04T13:21:31.4627125Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4627372Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4627384Z 2025-12-04T13:21:31.4627471Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4627483Z 2025-12-04T13:21:31.4627486Z 2025-12-04T13:21:31.4627561Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4627649Z Process 1 terminated with exit code 10, terminating remaining processes. 
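The "Process N exited with error code 10" and "terminating remaining processes" lines come from the multiprocess test harness joining its per-rank workers and checking their exit codes. The sketch below shows that general pattern with the standard library's multiprocessing; it is a simplified illustration, not the common_distributed.py implementation, and `worker` is a placeholder for the per-rank test body.

    import multiprocessing as mp

    def worker(rank, world_size):
        # Per-rank test body; raising here (or sys.exit(10)) gives the parent
        # a nonzero exit code for this rank.
        ...

    def run_test(fn=worker, world_size=4):
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=fn, args=(rank, world_size)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")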
2025-12-04T13:21:31.4627882Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f90149723f3b30c.xml - 2025-12-04T13:21:31.4627953Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4628262Z FAILED [10.1155s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4628308Z Traceback (most recent call last): 2025-12-04T13:21:31.4628471Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4628514Z getattr(self, test_name)() 2025-12-04T13:21:31.4628673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4628707Z fn() 2025-12-04T13:21:31.4628857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4628898Z method(*args, **kwargs) 2025-12-04T13:21:31.4629048Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4629087Z method(*args, **kwargs) 2025-12-04T13:21:31.4629237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4629274Z with policy(): 2025-12-04T13:21:31.4629426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4629467Z raise RuntimeError(msg) 2025-12-04T13:21:31.4629834Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T13:21:31.4629838Z 2025-12-04T13:21:31.4629910Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4630172Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4630174Z 2025-12-04T13:21:31.4630260Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4630263Z 2025-12-04T13:21:31.4630322Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4630368Z Traceback (most recent call last): 2025-12-04T13:21:31.4630530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4630571Z getattr(self, test_name)() 2025-12-04T13:21:31.4630730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4630764Z fn() 2025-12-04T13:21:31.4630915Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4630954Z method(*args, **kwargs) 2025-12-04T13:21:31.4631104Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4631142Z method(*args, **kwargs) 2025-12-04T13:21:31.4631305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4631354Z with policy(): 2025-12-04T13:21:31.4631505Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4631546Z raise RuntimeError(msg) 2025-12-04T13:21:31.4631909Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 78336 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T13:21:31.4631923Z 2025-12-04T13:21:31.4631997Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4632244Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4632246Z 2025-12-04T13:21:31.4632334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4632336Z 2025-12-04T13:21:31.4632393Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4632438Z Traceback (most recent call last): 2025-12-04T13:21:31.4632599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4632642Z getattr(self, test_name)() 2025-12-04T13:21:31.4632800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4632834Z fn() 2025-12-04T13:21:31.4632987Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4633025Z method(*args, **kwargs) 2025-12-04T13:21:31.4633176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4633215Z method(*args, **kwargs) 2025-12-04T13:21:31.4633365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4633400Z with policy(): 2025-12-04T13:21:31.4633551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4633591Z raise RuntimeError(msg) 2025-12-04T13:21:31.4633971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4633973Z 2025-12-04T13:21:31.4634046Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4634293Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4634296Z 2025-12-04T13:21:31.4634382Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4634446Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4634510Z ====================== 1 failed, 18 deselected in 10.28s ======================= 2025-12-04T13:21:31.4634547Z Got exit code 1 2025-12-04T13:21:31.4634587Z Retrying single test... 
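The repro command printed above can also be driven from Python with the same environment variables the message names; the commented-out PYTORCH_PRINT_REPRO_ON_FAILURE=0 line is the suppression switch the log mentions. This is only one possible way to invoke the printed command, shown for convenience.

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # PYTORCH_PRINT_REPRO_ON_FAILURE="0",  # uncomment to silence the repro banner
    )
    # Same command the failure message prints, run from the base repo dir.
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda",
        ],
        env=env,
        check=False,
    )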
2025-12-04T13:21:31.4634778Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d59e0d1f8082da7d.xml 2025-12-04T13:21:31.4634837Z ============================= test session starts ============================== 2025-12-04T13:21:31.4634959Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4635010Z cachedir: .pytest_cache 2025-12-04T13:21:31.4635168Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4635215Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4635254Z configfile: pytest.ini 2025-12-04T13:21:31.4635416Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4635501Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4635744Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4635788Z Running 1 items in this shard 2025-12-04T13:21:31.4635790Z 2025-12-04T13:21:31.4636112Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:17:19.485000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 557683 2025-12-04T13:21:31.4636268Z I1204 13:17:19.486000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 557684 2025-12-04T13:21:31.4636421Z I1204 13:17:19.486000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 557685 2025-12-04T13:21:31.4636573Z I1204 13:17:19.487000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 557686 2025-12-04T13:21:31.4637157Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4637196Z _warn_cpu_init() 2025-12-04T13:21:31.4637764Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4637802Z _warn_cpu_init() 2025-12-04T13:21:31.4638420Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4638458Z _warn_cpu_init() 2025-12-04T13:21:31.4639024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4639061Z _warn_cpu_init() 2025-12-04T13:21:31.4639368Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4639424Z return func(*args, **kwargs) 2025-12-04T13:21:31.4639567Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4639729Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4640017Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4640185Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4640471Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4640598Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4640874Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4641023Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4641303Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4641451Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4641728Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4641864Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4642141Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4642299Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4642795Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4642912Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4643107Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4643485Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4643610Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4643833Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4643998Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4644038Z dist init r=1, world=4 2025-12-04T13:21:31.4644187Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4644346Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4644634Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4644786Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4645071Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4645194Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4645471Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4645619Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4645894Z [rank0]:E1204 13:17:27.553000 557683 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4646044Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4646318Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4646455Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4646746Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4646894Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4647385Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:21:31.4647501Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4647697Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4648079Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4648239Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4648451Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4648635Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4648673Z dist init r=0, world=4 2025-12-04T13:21:31.4648812Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4648972Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4649259Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4649412Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4649697Z 
[rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4649821Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4650096Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4650244Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4650519Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4650668Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4650958Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4651094Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4651371Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4651519Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4652011Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 64000 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T13:21:31.4652167Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4652421Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4652795Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4652919Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4653131Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4653296Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4653335Z dist init r=2, world=4 2025-12-04T13:21:31.4653473Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4653632Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4653918Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4654074Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4654358Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4654482Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4654758Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4654906Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4655192Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4655340Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4655618Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4655754Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4656031Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4656181Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4656679Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4656803Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4657000Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4657383Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4657498Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4657708Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4657873Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4657911Z dist init r=3, world=4 2025-12-04T13:21:31.4658281Z [rank0]:[W1204 13:17:27.475103950 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4658322Z FAILED [10.0158s] [100%] 2025-12-04T13:21:31.4658324Z 2025-12-04T13:21:31.4658381Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4658495Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.4658542Z Traceback (most recent call last): 2025-12-04T13:21:31.4658705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4658748Z self._join_processes(fn) 2025-12-04T13:21:31.4658921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4658975Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4659153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4659196Z raise RuntimeError(error) 2025-12-04T13:21:31.4659289Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4659334Z Traceback (most recent call last): 2025-12-04T13:21:31.4659496Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4659538Z getattr(self, test_name)() 2025-12-04T13:21:31.4659696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4659729Z fn() 2025-12-04T13:21:31.4659880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4659921Z method(*args, **kwargs) 2025-12-04T13:21:31.4660071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4660109Z method(*args, **kwargs) 2025-12-04T13:21:31.4660261Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4660298Z with policy(): 2025-12-04T13:21:31.4660463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4660515Z raise RuntimeError(msg) 2025-12-04T13:21:31.4660887Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T13:21:31.4660903Z 2025-12-04T13:21:31.4660979Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4661226Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4661228Z 2025-12-04T13:21:31.4661317Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4661320Z 2025-12-04T13:21:31.4661321Z 2025-12-04T13:21:31.4661396Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4661485Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4661716Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d59e0d1f8082da7d.xml - 2025-12-04T13:21:31.4661777Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4662040Z FAILED [10.0158s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4662088Z Traceback (most recent call last): 2025-12-04T13:21:31.4662253Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4662293Z getattr(self, test_name)() 2025-12-04T13:21:31.4662454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4662487Z fn() 2025-12-04T13:21:31.4662639Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4662678Z method(*args, **kwargs) 2025-12-04T13:21:31.4662831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4662870Z method(*args, **kwargs) 2025-12-04T13:21:31.4663021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4663067Z with policy(): 2025-12-04T13:21:31.4663220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4663260Z raise RuntimeError(msg) 2025-12-04T13:21:31.4663627Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4663630Z 2025-12-04T13:21:31.4663703Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4663952Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4663954Z 2025-12-04T13:21:31.4664042Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4664104Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
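The UserWarning from _init_utils.py repeated above recommends giving FSDP a device_id so sharding initialization runs on the GPU and sync_module_states=True can communicate. A hedged sketch of that usage follows; the module and rank are placeholders rather than the models defined in test_fsdp_core.py, and it assumes a process group has already been initialized.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Placeholder module and local rank; a real process group must already exist.
    local_rank = 0
    module = torch.nn.Linear(8, 8)

    wrapped = FSDP(
        module,
        device_id=torch.device("cuda", local_rank),  # run sharding init on this GPU
        sync_module_states=True,                      # needs the module on GPU to broadcast states
    )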
2025-12-04T13:21:31.4664176Z ====================== 1 failed, 18 deselected in 10.17s ======================= 2025-12-04T13:21:31.4664229Z Got exit code 1 2025-12-04T13:21:31.4664425Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4664553Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4664740Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6683d1b284d3f9c9.xml 2025-12-04T13:21:31.4664807Z ============================= test session starts ============================== 2025-12-04T13:21:31.4664920Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4664961Z cachedir: .pytest_cache 2025-12-04T13:21:31.4665121Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4665167Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4665208Z configfile: pytest.ini 2025-12-04T13:21:31.4665369Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4665444Z collecting ... collected 60 items / 12 deselected / 48 selected 2025-12-04T13:21:31.4665497Z stepcurrent: skipping 12 already run items. 2025-12-04T13:21:31.4665541Z Running 7 items in this shard 2025-12-04T13:21:31.4665544Z 2025-12-04T13:21:31.4665854Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 13:17:32.412000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 558085 2025-12-04T13:21:31.4666009Z I1204 13:17:32.413000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 558086 2025-12-04T13:21:31.4666162Z I1204 13:17:32.414000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 558087 2025-12-04T13:21:31.4666312Z I1204 13:17:32.414000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 558088 2025-12-04T13:21:31.4666888Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4666926Z _warn_cpu_init() 2025-12-04T13:21:31.4667228Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4667272Z return func(*args, **kwargs) 2025-12-04T13:21:31.4667842Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4667880Z _warn_cpu_init() 2025-12-04T13:21:31.4668490Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4668543Z _warn_cpu_init() 2025-12-04T13:21:31.4669105Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4669155Z _warn_cpu_init() 2025-12-04T13:21:31.4669300Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4669463Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4669751Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4669907Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4670192Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4670318Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4670595Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4670745Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4671021Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4671170Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4671456Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4671594Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4671871Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4672019Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4672507Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 22016 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4672634Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4672828Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4673210Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4673337Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4673549Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4673714Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4673753Z dist init r=0, world=4 2025-12-04T13:21:31.4673891Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4674049Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4674336Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4674492Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4674779Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4674905Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4675181Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4675329Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4675614Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4675761Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4676038Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4676174Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4676451Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4676600Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4677095Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4677220Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4677414Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4677786Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4677899Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4678111Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4678346Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4678484Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4678642Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4678929Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4679085Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4679369Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4679494Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4679770Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4679934Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4680211Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4680358Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4680632Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4680768Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4681046Z [rank1]:E1204 13:17:40.716000 
558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4681214Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4681696Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4681827Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4682034Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4682396Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4682509Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4682719Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4682881Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4682920Z dist init r=3, world=4 2025-12-04T13:21:31.4682957Z dist init r=1, world=4 2025-12-04T13:21:31.4683095Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4683253Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4683541Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4683701Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4683984Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4684108Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4684393Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4684541Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4684815Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4684962Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4685238Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4685384Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4685661Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4685819Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4686301Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4686426Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4686621Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4686982Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4687096Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4687308Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4687471Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4687511Z dist init r=2, world=4 2025-12-04T13:21:31.4687846Z [rank0]:[W1204 13:17:40.514976621 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4687887Z FAILED [10.1169s] [ 14%] 2025-12-04T13:21:31.4687890Z 2025-12-04T13:21:31.4687946Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4688046Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __ 2025-12-04T13:21:31.4688093Z Traceback (most recent call last): 2025-12-04T13:21:31.4688292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4688348Z self._join_processes(fn) 2025-12-04T13:21:31.4688522Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4688576Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4688756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4688799Z raise RuntimeError(error) 2025-12-04T13:21:31.4688880Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4688925Z Traceback (most recent call last): 2025-12-04T13:21:31.4689088Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4689130Z getattr(self, test_name)() 2025-12-04T13:21:31.4689289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4689323Z fn() 2025-12-04T13:21:31.4689487Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4689529Z method(*args, **kwargs) 2025-12-04T13:21:31.4689693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4689733Z method(*args, **kwargs) 2025-12-04T13:21:31.4689883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4689920Z with policy(): 2025-12-04T13:21:31.4690083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4690124Z raise RuntimeError(msg) 2025-12-04T13:21:31.4690481Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 22016 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4690484Z 2025-12-04T13:21:31.4690559Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4690795Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4690797Z 2025-12-04T13:21:31.4690885Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4690888Z 2025-12-04T13:21:31.4690890Z 2025-12-04T13:21:31.4690964Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4691052Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4691285Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6683d1b284d3f9c9.xml - 2025-12-04T13:21:31.4691345Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4691597Z FAILED [10.1169s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4691645Z Traceback (most recent call last): 2025-12-04T13:21:31.4691809Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4691853Z getattr(self, test_name)() 2025-12-04T13:21:31.4692013Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4692048Z fn() 2025-12-04T13:21:31.4692209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4692249Z method(*args, **kwargs) 2025-12-04T13:21:31.4692401Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4692441Z method(*args, **kwargs) 2025-12-04T13:21:31.4692590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4692626Z with policy(): 2025-12-04T13:21:31.4692778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4692820Z raise RuntimeError(msg) 2025-12-04T13:21:31.4693174Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 22016 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4693176Z 2025-12-04T13:21:31.4693251Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4693493Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4693506Z 2025-12-04T13:21:31.4693593Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4693656Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
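Two other warnings recur in these runs: barrier() falling back to "the device under current context", which the message says can be silenced by passing device_id to init_process_group, and ProcessGroupNCCL complaining that destroy_process_group() was not called before exit. A minimal lifecycle sketch that addresses both is shown below; rank and world size are placeholders, and it assumes MASTER_ADDR/MASTER_PORT are provided by the launcher.

    import os
    import torch
    import torch.distributed as dist

    # Placeholder values; in the test harness these come from the spawned worker processes.
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))

    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # bind the group to one device; silences the barrier() warning
    )

    dist.barrier()

    # ... distributed workload would run here ...

    dist.destroy_process_group()  # explicit teardown avoids the ProcessGroupNCCL shutdown warning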
2025-12-04T13:21:31.4693718Z ====================== 1 failed, 12 deselected in 10.28s ======================= 2025-12-04T13:21:31.4693765Z Got exit code 1 2025-12-04T13:21:31.4693805Z Retrying single test... 2025-12-04T13:21:31.4693994Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d19ef080ca548a7a.xml 2025-12-04T13:21:31.4694052Z ============================= test session starts ============================== 2025-12-04T13:21:31.4694164Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4694204Z cachedir: .pytest_cache 2025-12-04T13:21:31.4694365Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4694410Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4694450Z configfile: pytest.ini 2025-12-04T13:21:31.4694611Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4694688Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4694916Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4694960Z Running 1 items in this shard 2025-12-04T13:21:31.4694963Z 2025-12-04T13:21:31.4695271Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 13:17:45.344000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 558487 2025-12-04T13:21:31.4695426Z I1204 13:17:45.345000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 558488 2025-12-04T13:21:31.4695577Z I1204 13:17:45.345000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 558489 2025-12-04T13:21:31.4695726Z I1204 13:17:45.346000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 558490 2025-12-04T13:21:31.4696323Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4696361Z _warn_cpu_init() 2025-12-04T13:21:31.4696931Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4696969Z _warn_cpu_init() 2025-12-04T13:21:31.4697544Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4697591Z _warn_cpu_init() 2025-12-04T13:21:31.4698194Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4698247Z _warn_cpu_init() 2025-12-04T13:21:31.4698539Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4698582Z return func(*args, **kwargs) 2025-12-04T13:21:31.4698725Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4698886Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4699175Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4699333Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4699629Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4699754Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4700033Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4700182Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4700470Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4700620Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4700894Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4701032Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4701307Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4701456Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4701953Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4702081Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4702276Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4702647Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4702762Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4702973Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4703138Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4703178Z dist init r=1, world=4 2025-12-04T13:21:31.4703315Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4703475Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4703762Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4703918Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4704203Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4704328Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4704604Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4704760Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4705036Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4705182Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4705457Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4705593Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4705871Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4706028Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4706523Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4706648Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4706843Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4707205Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4707318Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4707529Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4707693Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4707732Z dist init r=3, world=4 2025-12-04T13:21:31.4707869Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4708029Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4708344Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4708498Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4708781Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4708904Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4709198Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4709346Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4709621Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4709768Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4710043Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4710190Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4710465Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4710626Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4711105Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4711248Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4711445Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4711807Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4711922Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4712132Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4712298Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4712336Z dist init r=2, world=4 2025-12-04T13:21:31.4712474Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4712633Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4712919Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4713075Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4713370Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4713494Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4713770Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4713917Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4714193Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4714340Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4714626Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4714770Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4715047Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4715205Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4715686Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4715801Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4715996Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4716357Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4716470Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4716681Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4716844Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4716884Z dist init r=0, world=4 2025-12-04T13:21:31.4717221Z [rank0]:[W1204 13:17:54.714627307 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4717263Z FAILED [10.3163s] [100%] 2025-12-04T13:21:31.4717265Z 2025-12-04T13:21:31.4717321Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4717429Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __ 2025-12-04T13:21:31.4717477Z Traceback (most recent call last): 2025-12-04T13:21:31.4717640Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4717685Z self._join_processes(fn) 2025-12-04T13:21:31.4717857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4717910Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4718088Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4718132Z raise RuntimeError(error) 2025-12-04T13:21:31.4718258Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4718305Z Traceback (most recent call last): 2025-12-04T13:21:31.4718466Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4718522Z getattr(self, test_name)() 2025-12-04T13:21:31.4718680Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4718727Z fn() 2025-12-04T13:21:31.4718878Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4718919Z method(*args, **kwargs) 2025-12-04T13:21:31.4719069Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4719124Z method(*args, **kwargs) 2025-12-04T13:21:31.4719275Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4719313Z with policy(): 2025-12-04T13:21:31.4719465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4719507Z raise RuntimeError(msg) 2025-12-04T13:21:31.4719862Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4719866Z 2025-12-04T13:21:31.4719941Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4720177Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4720179Z 2025-12-04T13:21:31.4720266Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4720269Z 2025-12-04T13:21:31.4720270Z 2025-12-04T13:21:31.4720345Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4720434Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4720669Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d19ef080ca548a7a.xml - 2025-12-04T13:21:31.4720728Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4720980Z FAILED [10.3163s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4721028Z Traceback (most recent call last): 2025-12-04T13:21:31.4721191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4721247Z getattr(self, test_name)() 2025-12-04T13:21:31.4721407Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4721442Z fn() 2025-12-04T13:21:31.4721594Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4721636Z method(*args, **kwargs) 2025-12-04T13:21:31.4721787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4721826Z method(*args, **kwargs) 2025-12-04T13:21:31.4721976Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4722013Z with policy(): 2025-12-04T13:21:31.4722166Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4722206Z raise RuntimeError(msg) 2025-12-04T13:21:31.4722574Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4722587Z 2025-12-04T13:21:31.4722663Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4722895Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4722908Z 2025-12-04T13:21:31.4722997Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4723059Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
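Each rank also prints the _warn_cpu_init() UserWarning because the module handed to FSDP is still on CPU when sharding initialization starts. As the warning itself suggests, passing device_id moves the module to the rank's GPU first, which also makes sync_module_states=True valid. A minimal sketch of that call, assuming the process group is already initialized and MyModel is a placeholder module defined elsewhere:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Assumes dist.init_process_group() has already run on every rank.
    rank = dist.get_rank()
    device = torch.device("cuda", rank % torch.cuda.device_count())

    model = MyModel()  # placeholder module, defined elsewhere
    # device_id tells FSDP to move the CPU module onto this GPU before sharding
    # init, which also satisfies the GPU-communication requirement of
    # sync_module_states=True.
    fsdp_model = FSDP(model, device_id=device, sync_module_states=True)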
2025-12-04T13:21:31.4723122Z ====================== 1 failed, 18 deselected in 10.45s ======================= 2025-12-04T13:21:31.4723159Z Got exit code 1 2025-12-04T13:21:31.4723198Z Retrying single test... 2025-12-04T13:21:31.4723388Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-50f384147ce25093.xml 2025-12-04T13:21:31.4723446Z ============================= test session starts ============================== 2025-12-04T13:21:31.4723558Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4723598Z cachedir: .pytest_cache 2025-12-04T13:21:31.4723756Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4723802Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4723842Z configfile: pytest.ini 2025-12-04T13:21:31.4724004Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4724079Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4724307Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4724351Z Running 1 items in this shard 2025-12-04T13:21:31.4724353Z 2025-12-04T13:21:31.4724661Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 13:17:58.220000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 558889 2025-12-04T13:21:31.4724816Z I1204 13:17:58.221000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 558890 2025-12-04T13:21:31.4724968Z I1204 13:17:58.221000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 558891 2025-12-04T13:21:31.4725129Z I1204 13:17:58.222000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 558892 2025-12-04T13:21:31.4725710Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4725748Z _warn_cpu_init() 2025-12-04T13:21:31.4726316Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4726363Z _warn_cpu_init() 2025-12-04T13:21:31.4726655Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. 
You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4726716Z return func(*args, **kwargs) 2025-12-04T13:21:31.4727283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4727330Z _warn_cpu_init() 2025-12-04T13:21:31.4727898Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4727937Z _warn_cpu_init() 2025-12-04T13:21:31.4728081Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4728280Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4728570Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4728725Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4729010Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4729137Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4729415Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4729576Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4729852Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4730001Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4730276Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4730414Z [rank1]:E1204 13:18:06.354000 558890 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4730692Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4730854Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4731347Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4731475Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4731672Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4732035Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4732151Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4732361Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4732528Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4732568Z dist init r=1, world=4 2025-12-04T13:21:31.4732705Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4732867Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4733154Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4733309Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4733591Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4733728Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4734005Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4734153Z [rank2]:E1204 13:18:06.357000 558891 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4734429Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4734576Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4734851Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4734997Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4735287Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4735434Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4735913Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 24064 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4736042Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4736237Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4736598Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4736713Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4736928Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4737092Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4737131Z dist init r=2, world=4 2025-12-04T13:21:31.4737270Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4737429Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4737715Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4737870Z [rank0]:E1204 
13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4738196Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4738320Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4738596Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4738743Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4739020Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4739168Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4739466Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4739614Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4739890Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4740051Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4740536Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4740650Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4740846Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4741205Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4741321Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4741532Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4741696Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4741734Z dist init r=0, world=4 2025-12-04T13:21:31.4741872Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4742031Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4742329Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4742484Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4742767Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4742891Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4743167Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4743316Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4743606Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4743763Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4744037Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4744182Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4744459Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4744608Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4745085Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4745201Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4745394Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4745759Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4745873Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4746083Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4746246Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4746285Z dist init r=3, world=4 2025-12-04T13:21:31.4746629Z [rank0]:[W1204 13:18:06.217449301 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4746671Z FAILED [10.0148s] [100%] 2025-12-04T13:21:31.4746673Z 2025-12-04T13:21:31.4746730Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4746830Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __ 2025-12-04T13:21:31.4746877Z Traceback (most recent call last): 2025-12-04T13:21:31.4747037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4747083Z self._join_processes(fn) 2025-12-04T13:21:31.4747254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4747308Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4747485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4747539Z raise RuntimeError(error) 2025-12-04T13:21:31.4747619Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4747675Z Traceback (most recent call last): 2025-12-04T13:21:31.4747835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4747877Z getattr(self, test_name)() 2025-12-04T13:21:31.4748034Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4748079Z fn() 2025-12-04T13:21:31.4748267Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4748308Z method(*args, **kwargs) 2025-12-04T13:21:31.4748461Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4748501Z method(*args, **kwargs) 2025-12-04T13:21:31.4748652Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4748690Z with policy(): 2025-12-04T13:21:31.4748841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4748882Z raise RuntimeError(msg) 2025-12-04T13:21:31.4749240Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:21:31.4749245Z 2025-12-04T13:21:31.4749319Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4749555Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4749557Z 2025-12-04T13:21:31.4749645Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4749648Z 2025-12-04T13:21:31.4749649Z 2025-12-04T13:21:31.4749724Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4749811Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4750043Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-50f384147ce25093.xml - 2025-12-04T13:21:31.4750103Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4750368Z FAILED [10.0148s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4750416Z Traceback (most recent call last): 2025-12-04T13:21:31.4750579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4750622Z getattr(self, test_name)() 2025-12-04T13:21:31.4750780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4750815Z fn() 2025-12-04T13:21:31.4750965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4751007Z method(*args, **kwargs) 2025-12-04T13:21:31.4751158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4751197Z method(*args, **kwargs) 2025-12-04T13:21:31.4751348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4751404Z with policy(): 2025-12-04T13:21:31.4751556Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4751616Z raise RuntimeError(msg) 2025-12-04T13:21:31.4751971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4751993Z 2025-12-04T13:21:31.4752067Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4752302Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4752304Z 2025-12-04T13:21:31.4752391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4752454Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
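Two further warnings repeat through every retry: barrier() reports that it is "using the device under current context", and ProcessGroupNCCL warns that destroy_process_group() was never called before exit. A hedged sketch of the setup and teardown both warnings point at is shown below; the nccl backend string and the reliance on launcher-provided RANK are assumptions for a single-node, multi-GPU run, not details taken from this job.

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        rank = int(os.environ["RANK"])  # assumed to be set by the launcher
        device = torch.device("cuda", rank % torch.cuda.device_count())

        # Binding a device_id here addresses the barrier() warning, since
        # collectives then know which GPU the process group should use.
        dist.init_process_group(backend="nccl", device_id=device)
        try:
            dist.barrier()
            # ... run the distributed workload ...
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning and releases NCCL resources.
            dist.destroy_process_group()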
2025-12-04T13:21:31.4752517Z ====================== 1 failed, 18 deselected in 10.15s ======================= 2025-12-04T13:21:31.4752554Z Got exit code 1 2025-12-04T13:21:31.4752736Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4752865Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4753054Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d75a93e73e18887b.xml 2025-12-04T13:21:31.4753112Z ============================= test session starts ============================== 2025-12-04T13:21:31.4753223Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4753266Z cachedir: .pytest_cache 2025-12-04T13:21:31.4753424Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4753473Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4753512Z configfile: pytest.ini 2025-12-04T13:21:31.4753675Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4753749Z collecting ... collected 60 items / 13 deselected / 47 selected 2025-12-04T13:21:31.4753803Z stepcurrent: skipping 13 already run items. 2025-12-04T13:21:31.4753846Z Running 6 items in this shard 2025-12-04T13:21:31.4753849Z 2025-12-04T13:21:31.4754177Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:18:10.698000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 559291 2025-12-04T13:21:31.4754334Z I1204 13:18:10.699000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 559292 2025-12-04T13:21:31.4754486Z I1204 13:18:10.700000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 559293 2025-12-04T13:21:31.4754636Z I1204 13:18:10.700000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 559294 2025-12-04T13:21:31.4755215Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4755254Z _warn_cpu_init() 2025-12-04T13:21:31.4755835Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4755881Z _warn_cpu_init() 2025-12-04T13:21:31.4756456Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4756493Z _warn_cpu_init() 2025-12-04T13:21:31.4757056Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4757095Z _warn_cpu_init() 2025-12-04T13:21:31.4757385Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4757429Z return func(*args, **kwargs) 2025-12-04T13:21:31.4757572Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4757734Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4758023Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4758207Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4758517Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4758643Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4758921Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4759069Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4759346Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4759493Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4759782Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4759918Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4760207Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4760355Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4760859Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4760976Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4761171Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4761541Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4761657Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4761870Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4762035Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4762074Z dist init r=2, world=4 2025-12-04T13:21:31.4762212Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4762370Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4762657Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4762822Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4763109Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4763233Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4763509Z [rank0]:E1204 13:18:18.767000 559291 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4763657Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4763933Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4764089Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4764373Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4764510Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4764795Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4764945Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4765435Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4765551Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4765747Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4766114Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4766231Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4766442Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4766607Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4766647Z dist init r=0, world=4 2025-12-04T13:21:31.4766784Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4766954Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4767240Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4767395Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4767679Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4767803Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4768127Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4768328Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4768604Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4768763Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4769037Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4769185Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4769463Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4769612Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4770097Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4770213Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4770409Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4770776Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4770893Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4771105Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4771269Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4771321Z dist init r=3, world=4 2025-12-04T13:21:31.4771458Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4771617Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4771904Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4772057Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4772343Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4772467Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4772752Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4772911Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.4773188Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4773354Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4773631Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4773767Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4774044Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4774193Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4774677Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4774793Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4774988Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4775352Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4775469Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4775688Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4775853Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4775892Z dist init r=1, world=4 2025-12-04T13:21:31.4776229Z [rank0]:[W1204 13:18:19.727574385 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4776270Z FAILED [10.0155s] [ 16%] 2025-12-04T13:21:31.4776273Z 2025-12-04T13:21:31.4776328Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4776438Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.4776485Z Traceback (most recent call last): 2025-12-04T13:21:31.4776647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4776699Z self._join_processes(fn) 2025-12-04T13:21:31.4776871Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4776935Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4777113Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4777156Z raise RuntimeError(error) 2025-12-04T13:21:31.4777249Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4777294Z Traceback (most recent call last): 2025-12-04T13:21:31.4777456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4777498Z getattr(self, test_name)() 2025-12-04T13:21:31.4777659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4777693Z fn() 2025-12-04T13:21:31.4777845Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4777885Z method(*args, **kwargs) 2025-12-04T13:21:31.4778038Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4778076Z method(*args, **kwargs) 2025-12-04T13:21:31.4778274Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4778311Z with policy(): 2025-12-04T13:21:31.4778463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4778505Z raise RuntimeError(msg) 2025-12-04T13:21:31.4778868Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:21:31.4778872Z 2025-12-04T13:21:31.4778947Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4779191Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4779194Z 2025-12-04T13:21:31.4779283Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4779285Z 2025-12-04T13:21:31.4779287Z 2025-12-04T13:21:31.4779377Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4779466Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4779697Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d75a93e73e18887b.xml - 2025-12-04T13:21:31.4779759Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4780016Z FAILED [10.0155s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4780062Z Traceback (most recent call last): 2025-12-04T13:21:31.4780226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4780268Z getattr(self, test_name)() 2025-12-04T13:21:31.4780427Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4780461Z fn() 2025-12-04T13:21:31.4780624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4780675Z method(*args, **kwargs) 2025-12-04T13:21:31.4780826Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4780865Z method(*args, **kwargs) 2025-12-04T13:21:31.4781015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4781066Z with policy(): 2025-12-04T13:21:31.4781218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4781258Z raise RuntimeError(msg) 2025-12-04T13:21:31.4781626Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4781629Z 2025-12-04T13:21:31.4781705Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4781944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4781946Z 2025-12-04T13:21:31.4782036Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4782100Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
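The repeated UserWarning from torch/distributed/fsdp/_init_utils.py above recommends passing device_id so FSDP moves the CPU-resident module to the GPU before sharding initialization. A minimal sketch of that recommendation follows; it is not taken from test_fsdp_core.py itself (the offload_true variants deliberately start the module on CPU), and the model and rank handling here are placeholders.

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(local_rank: int) -> FSDP:
    model = nn.Linear(1024, 1024)  # module starts on CPU, as in the warning above
    # Passing device_id lets FSDP move the module to the local GPU so sharding
    # initialization (and sync_module_states=True) run on-device instead of on CPU.
    return FSDP(
        model,
        device_id=torch.device("cuda", local_rank),
        sync_module_states=True,
    )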
2025-12-04T13:21:31.4782163Z ====================== 1 failed, 13 deselected in 10.15s ======================= 2025-12-04T13:21:31.4782200Z Got exit code 1 2025-12-04T13:21:31.4782243Z Retrying single test... 2025-12-04T13:21:31.4782432Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-35564a50697736ba.xml 2025-12-04T13:21:31.4782490Z ============================= test session starts ============================== 2025-12-04T13:21:31.4782602Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4782644Z cachedir: .pytest_cache 2025-12-04T13:21:31.4782801Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4782848Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4782888Z configfile: pytest.ini 2025-12-04T13:21:31.4783050Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4783124Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4783370Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4783415Z Running 1 items in this shard 2025-12-04T13:21:31.4783418Z 2025-12-04T13:21:31.4783734Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:18:23.231000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 559693 2025-12-04T13:21:31.4783889Z I1204 13:18:23.232000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 559694 2025-12-04T13:21:31.4784041Z I1204 13:18:23.233000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 559695 2025-12-04T13:21:31.4784192Z I1204 13:18:23.233000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 559696 2025-12-04T13:21:31.4784781Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4784828Z _warn_cpu_init() 2025-12-04T13:21:31.4785397Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4785538Z _warn_cpu_init() 2025-12-04T13:21:31.4788024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4788067Z _warn_cpu_init() 2025-12-04T13:21:31.4788698Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4788735Z _warn_cpu_init() 2025-12-04T13:21:31.4789030Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4789074Z return func(*args, **kwargs) 2025-12-04T13:21:31.4789219Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4789381Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4789709Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4789867Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4790152Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4790278Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4790556Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4790706Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4790998Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4791159Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4791436Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4791573Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4791865Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4792013Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4792504Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4792621Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4792818Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4793189Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4793304Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4793518Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4793683Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4793723Z dist init r=1, world=4 2025-12-04T13:21:31.4793861Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4794030Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4794318Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4794472Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4794757Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4794881Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4795158Z [rank0]:E1204 13:18:31.203000 559693 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4795315Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4795601Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4795749Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4796036Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4796174Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4796451Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4796600Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4797086Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4797203Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4797399Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4797770Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4797885Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4798097Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4798314Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4798353Z dist init r=0, world=4 2025-12-04T13:21:31.4798491Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4798652Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4798939Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4799093Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4799379Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4799515Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4799790Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4799953Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4800228Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4800388Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4800663Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4800800Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4801079Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4801228Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4801714Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4801828Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4802023Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4802391Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4802514Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4802728Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4802892Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4802932Z dist init r=3, world=4 2025-12-04T13:21:31.4803068Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4803227Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4803514Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4803668Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4803965Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4804102Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4804379Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4804550Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.4804829Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4804975Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4805251Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4805386Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4805665Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4805814Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4806301Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4806417Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4806612Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4807003Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4807118Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4807329Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4807493Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4807531Z dist init r=2, world=4 2025-12-04T13:21:31.4807869Z [rank0]:[W1204 13:18:31.085636797 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4807908Z FAILED [9.8154s] [100%] 2025-12-04T13:21:31.4807912Z 2025-12-04T13:21:31.4807981Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4808100Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.4808174Z Traceback (most recent call last): 2025-12-04T13:21:31.4808339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4808383Z self._join_processes(fn) 2025-12-04T13:21:31.4808554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4808626Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4808804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4808848Z raise RuntimeError(error) 2025-12-04T13:21:31.4808929Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4808974Z Traceback (most recent call last): 2025-12-04T13:21:31.4809136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4809178Z getattr(self, test_name)() 2025-12-04T13:21:31.4809335Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4809369Z fn() 2025-12-04T13:21:31.4809523Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4809563Z method(*args, **kwargs) 2025-12-04T13:21:31.4809714Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4809753Z method(*args, **kwargs) 2025-12-04T13:21:31.4809903Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4809940Z with policy(): 2025-12-04T13:21:31.4810093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4810132Z raise RuntimeError(msg) 2025-12-04T13:21:31.4810494Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4810498Z 2025-12-04T13:21:31.4810573Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4810829Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4810832Z 2025-12-04T13:21:31.4810921Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4810924Z 2025-12-04T13:21:31.4810926Z 2025-12-04T13:21:31.4811002Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4811089Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4811320Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-35564a50697736ba.xml - 2025-12-04T13:21:31.4811381Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4811638Z FAILED [9.8154s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4811684Z Traceback (most recent call last): 2025-12-04T13:21:31.4811862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4811917Z getattr(self, test_name)() 2025-12-04T13:21:31.4812076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4812111Z fn() 2025-12-04T13:21:31.4812262Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4812312Z method(*args, **kwargs) 2025-12-04T13:21:31.4812463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4812502Z method(*args, **kwargs) 2025-12-04T13:21:31.4812653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4812690Z with policy(): 2025-12-04T13:21:31.4812844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4812885Z raise RuntimeError(msg) 2025-12-04T13:21:31.4813244Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4813247Z 2025-12-04T13:21:31.4813321Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4813562Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4813565Z 2025-12-04T13:21:31.4813651Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4813716Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
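Two other warnings recur in the run above: c10d_logger.py suggests passing device_id to init_process_group to silence the barrier() device-guessing warning, and ProcessGroupNCCL warns that destroy_process_group() was not called before exit. A minimal setup/teardown sketch addressing both follows; it assumes a recent PyTorch where init_process_group accepts device_id and that the launcher has set the usual rendezvous environment variables.

import os
import torch
import torch.distributed as dist

def init_and_teardown() -> None:
    # Assumes MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE, LOCAL_RANK are set by the launcher.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)
    # Binding the process group to a device avoids the barrier() warning above.
    dist.init_process_group(
        backend="nccl",  # maps to RCCL on ROCm
        device_id=torch.device("cuda", local_rank),
    )
    try:
        dist.barrier()
        # ... test body ...
    finally:
        # Explicit teardown avoids the ProcessGroupNCCL resource-leak warning.
        dist.destroy_process_group()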
2025-12-04T13:21:31.4813778Z ======================= 1 failed, 18 deselected in 9.95s ======================= 2025-12-04T13:21:31.4813816Z Got exit code 1 2025-12-04T13:21:31.4813855Z Retrying single test... 2025-12-04T13:21:31.4814043Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-168828f8a7ed70a3.xml 2025-12-04T13:21:31.4814101Z ============================= test session starts ============================== 2025-12-04T13:21:31.4814215Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4814256Z cachedir: .pytest_cache 2025-12-04T13:21:31.4814427Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4814474Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4814513Z configfile: pytest.ini 2025-12-04T13:21:31.4814679Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4814753Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4814989Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4815032Z Running 1 items in this shard 2025-12-04T13:21:31.4815035Z 2025-12-04T13:21:31.4815352Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:18:35.655000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 560095 2025-12-04T13:21:31.4815508Z I1204 13:18:35.656000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 560096 2025-12-04T13:21:31.4815673Z I1204 13:18:35.657000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 560097 2025-12-04T13:21:31.4815835Z I1204 13:18:35.657000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 560098 2025-12-04T13:21:31.4816414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4816461Z _warn_cpu_init() 2025-12-04T13:21:31.4817031Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4817070Z _warn_cpu_init() 2025-12-04T13:21:31.4817634Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4817673Z _warn_cpu_init() 2025-12-04T13:21:31.4818276Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4818313Z _warn_cpu_init() 2025-12-04T13:21:31.4818604Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4818647Z return func(*args, **kwargs) 2025-12-04T13:21:31.4818805Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4818968Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4819257Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4819413Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4819700Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4819827Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4820116Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4820286Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4820562Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4820722Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4820998Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4821135Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4821412Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4821562Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4822053Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4822168Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4822365Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4822735Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4822851Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4823062Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4823237Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4823277Z dist init r=0, world=4 2025-12-04T13:21:31.4823414Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4823574Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4823860Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4824015Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4824319Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4824444Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4824731Z [rank3]:E1204 13:18:43.616000 560098 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4824878Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4825165Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4825311Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4825588Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4825725Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4826002Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4826152Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4826642Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4826758Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4826952Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4827321Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4827447Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4827658Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4827823Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4827860Z dist init r=3, world=4 2025-12-04T13:21:31.4827998Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4828201Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4828488Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4828656Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4828953Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4829077Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4829356Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4829520Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4829796Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4829944Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4830219Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4830357Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4830634Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4830781Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4831266Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4831381Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4831576Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4831960Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4832075Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4832286Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4832448Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4832487Z dist init r=1, world=4 2025-12-04T13:21:31.4832626Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4832786Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4833083Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4833247Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4833531Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4833668Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4833945Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4834092Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.4834368Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4834515Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4834793Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4834929Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4835206Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4835354Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4835836Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4835962Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4836159Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4836526Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4836639Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4836850Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4837015Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4837053Z dist init r=2, world=4 2025-12-04T13:21:31.4837409Z [rank0]:[W1204 13:18:43.461412588 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4837458Z FAILED [9.8141s] [100%] 2025-12-04T13:21:31.4837460Z 2025-12-04T13:21:31.4837517Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4837637Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.4837684Z Traceback (most recent call last): 2025-12-04T13:21:31.4837846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4837890Z self._join_processes(fn) 2025-12-04T13:21:31.4838063Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4838117Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4838340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4838382Z raise RuntimeError(error) 2025-12-04T13:21:31.4838463Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4838507Z Traceback (most recent call last): 2025-12-04T13:21:31.4838669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4838710Z getattr(self, test_name)() 2025-12-04T13:21:31.4838869Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4838902Z fn() 2025-12-04T13:21:31.4839055Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4839095Z method(*args, **kwargs) 2025-12-04T13:21:31.4839247Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4839286Z method(*args, **kwargs) 2025-12-04T13:21:31.4839436Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4839472Z with policy(): 2025-12-04T13:21:31.4839625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4839664Z raise RuntimeError(msg) 2025-12-04T13:21:31.4840043Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4840046Z 2025-12-04T13:21:31.4840121Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4840365Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4840368Z 2025-12-04T13:21:31.4840457Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4840460Z 2025-12-04T13:21:31.4840521Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4840567Z Traceback (most recent call last): 2025-12-04T13:21:31.4840729Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4840772Z getattr(self, test_name)() 2025-12-04T13:21:31.4840944Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4840979Z fn() 2025-12-04T13:21:31.4841142Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4841181Z method(*args, **kwargs) 2025-12-04T13:21:31.4841331Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4841372Z method(*args, **kwargs) 2025-12-04T13:21:31.4841536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4841572Z with policy(): 2025-12-04T13:21:31.4841723Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4841765Z raise RuntimeError(msg) 2025-12-04T13:21:31.4842123Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4842128Z 2025-12-04T13:21:31.4842201Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4842442Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4842445Z 2025-12-04T13:21:31.4842532Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4842534Z 2025-12-04T13:21:31.4842536Z 2025-12-04T13:21:31.4842613Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4842700Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.4842933Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-168828f8a7ed70a3.xml - 2025-12-04T13:21:31.4842995Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4843251Z FAILED [9.8141s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4843296Z Traceback (most recent call last): 2025-12-04T13:21:31.4843461Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4843502Z getattr(self, test_name)() 2025-12-04T13:21:31.4843671Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4843705Z fn() 2025-12-04T13:21:31.4843857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4843898Z method(*args, **kwargs) 2025-12-04T13:21:31.4844049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4844088Z method(*args, **kwargs) 2025-12-04T13:21:31.4844238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4844275Z with policy(): 2025-12-04T13:21:31.4844425Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4844465Z raise RuntimeError(msg) 2025-12-04T13:21:31.4844835Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4844839Z 2025-12-04T13:21:31.4844921Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4845163Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4845165Z 2025-12-04T13:21:31.4845252Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4845268Z 2025-12-04T13:21:31.4845327Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4845371Z Traceback (most recent call last): 2025-12-04T13:21:31.4845534Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4845575Z getattr(self, test_name)() 2025-12-04T13:21:31.4845734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4845768Z fn() 2025-12-04T13:21:31.4845919Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4845959Z method(*args, **kwargs) 2025-12-04T13:21:31.4846108Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4846146Z method(*args, **kwargs) 2025-12-04T13:21:31.4846296Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4846332Z with policy(): 2025-12-04T13:21:31.4846484Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4846524Z raise RuntimeError(msg) 2025-12-04T13:21:31.4846883Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4846886Z 2025-12-04T13:21:31.4846959Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4847201Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4847205Z 2025-12-04T13:21:31.4847292Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4847356Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
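Each run above also repeats the _warn_cpu_init() UserWarning: the module handed to FSDP still lives on CPU, so sharding initialization runs on CPU and sync_module_states=True cannot use GPU collectives. The warning's own suggestion is to pass device_id. The snippet below is a minimal sketch of that call pattern, assuming a process group is already initialized (for example under torchrun); MyModel is a placeholder, not the model used by this test.

```python
# Minimal sketch: construct FSDP with device_id so the CPU-resident module is
# moved to the local GPU before sharding, which avoids the _warn_cpu_init()
# warning and satisfies the requirement of sync_module_states=True.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


class MyModel(nn.Module):  # placeholder model
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 1))

    def forward(self, x):
        return self.net(x)


def wrap_model() -> FSDP:
    return FSDP(
        MyModel(),
        device_id=torch.cuda.current_device(),
        sync_module_states=True,
    )
```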
2025-12-04T13:21:31.4847429Z ======================= 1 failed, 18 deselected in 9.95s ======================= 2025-12-04T13:21:31.4847466Z Got exit code 1 2025-12-04T13:21:31.4847657Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4847787Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4847976Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a4b1e12efc4a33d8.xml 2025-12-04T13:21:31.4848035Z ============================= test session starts ============================== 2025-12-04T13:21:31.4848190Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4848231Z cachedir: .pytest_cache 2025-12-04T13:21:31.4848389Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4848436Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4848476Z configfile: pytest.ini 2025-12-04T13:21:31.4848652Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4848738Z collecting ... collected 60 items / 14 deselected / 46 selected 2025-12-04T13:21:31.4848791Z stepcurrent: skipping 14 already run items. 2025-12-04T13:21:31.4848835Z Running 5 items in this shard 2025-12-04T13:21:31.4848837Z 2025-12-04T13:21:31.4849193Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 13:18:48.110000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 560497 2025-12-04T13:21:31.4849361Z I1204 13:18:48.110000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 560498 2025-12-04T13:21:31.4849513Z I1204 13:18:48.111000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 560499 2025-12-04T13:21:31.4849666Z I1204 13:18:48.112000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 560500 2025-12-04T13:21:31.4849960Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4850013Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4850591Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4850630Z _warn_cpu_init() 2025-12-04T13:21:31.4850920Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4851000Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4851285Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4851337Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4851930Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4851968Z _warn_cpu_init() 2025-12-04T13:21:31.4852255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4852331Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4852618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4852666Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4853246Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4853292Z _warn_cpu_init() 2025-12-04T13:21:31.4853577Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4853662Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4853946Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4853996Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4854566Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4854604Z _warn_cpu_init() 2025-12-04T13:21:31.4854890Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4854963Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4855193Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4855235Z return func(*args, **kwargs) 2025-12-04T13:21:31.4855459Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4855500Z return func(*args, **kwargs) 2025-12-04T13:21:31.4855722Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4855763Z return func(*args, **kwargs) 2025-12-04T13:21:31.4855994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856034Z return func(*args, **kwargs) 2025-12-04T13:21:31.4856254Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856296Z return func(*args, **kwargs) 2025-12-04T13:21:31.4856514Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856554Z return func(*args, **kwargs) 2025-12-04T13:21:31.4856775Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856815Z return func(*args, **kwargs) 2025-12-04T13:21:31.4857034Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4857085Z return func(*args, **kwargs) 2025-12-04T13:21:31.4857377Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.4857429Z return func(*args, **kwargs) 2025-12-04T13:21:31.4857573Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4857748Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4858038Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4858230Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4858517Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4858642Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4858921Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4859073Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4859351Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4859498Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4859775Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4859913Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4860204Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4860353Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4860885Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 
2025-12-04T13:21:31.4861002Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4861199Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4861626Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4861757Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4861967Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4862144Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4862182Z dist init r=3, world=4 2025-12-04T13:21:31.4862321Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4862482Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4862770Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4862924Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4863208Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4863334Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4863610Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4863759Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4864035Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4864185Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4864474Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4864613Z [rank0]:E1204 13:18:54.045000 560497 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4864891Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4865038Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4865567Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 2025-12-04T13:21:31.4865692Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4865888Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4866308Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4866438Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4866653Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4866819Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4866957Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4867115Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4867402Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4867556Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4867842Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4867965Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4868274Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4868422Z [rank1]:E1204 
13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4868709Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4868857Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4869134Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4869272Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4869549Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4869697Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4870238Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 
2025-12-04T13:21:31.4870363Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4870558Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4870976Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4871091Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4871302Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4871467Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4871507Z dist init r=0, world=4 2025-12-04T13:21:31.4871544Z dist init r=1, world=4 2025-12-04T13:21:31.4871682Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4871842Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4872132Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4872285Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4872569Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4872695Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4872982Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4873130Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4873405Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4873554Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4873827Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4873964Z [rank2]:E1204 13:18:54.093000 560499 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4874253Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4874410Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4874934Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 2025-12-04T13:21:31.4875058Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4875255Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4875663Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4875777Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4875988Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4876154Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4876193Z dist init r=2, world=4 2025-12-04T13:21:31.4876528Z [rank0]:[W1204 13:18:54.959531404 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4876569Z FAILED [7.7129s] [ 20%] 2025-12-04T13:21:31.4876571Z 2025-12-04T13:21:31.4876627Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4876773Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T13:21:31.4876820Z Traceback (most recent call last): 2025-12-04T13:21:31.4876985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4877027Z self._join_processes(fn) 2025-12-04T13:21:31.4877209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4877265Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4877441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4877486Z raise RuntimeError(error) 2025-12-04T13:21:31.4877566Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4877611Z Traceback (most recent call last): 2025-12-04T13:21:31.4877771Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4877816Z getattr(self, test_name)() 2025-12-04T13:21:31.4877973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4878007Z fn() 2025-12-04T13:21:31.4878195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4878250Z method(*args, **kwargs) 2025-12-04T13:21:31.4878401Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4878454Z method(*args, **kwargs) 2025-12-04T13:21:31.4878603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4878640Z with policy(): 2025-12-04T13:21:31.4878791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4878845Z raise RuntimeError(msg) 2025-12-04T13:21:31.4879249Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4879252Z 2025-12-04T13:21:31.4879328Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4879611Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4879613Z 2025-12-04T13:21:31.4879702Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4879705Z 2025-12-04T13:21:31.4879764Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4879809Z Traceback (most recent call last): 2025-12-04T13:21:31.4879973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4880016Z getattr(self, test_name)() 2025-12-04T13:21:31.4880176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4880209Z fn() 2025-12-04T13:21:31.4880361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4880401Z method(*args, **kwargs) 2025-12-04T13:21:31.4880552Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4880590Z method(*args, **kwargs) 2025-12-04T13:21:31.4880740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4880777Z with policy(): 2025-12-04T13:21:31.4880929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4880981Z raise RuntimeError(msg) 2025-12-04T13:21:31.4881385Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T13:21:31.4881388Z 2025-12-04T13:21:31.4881463Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4881742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4881745Z 2025-12-04T13:21:31.4881832Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4881834Z 2025-12-04T13:21:31.4881836Z 2025-12-04T13:21:31.4881912Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4882000Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.4882246Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a4b1e12efc4a33d8.xml - 2025-12-04T13:21:31.4882324Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4882621Z FAILED [7.7129s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4882678Z Traceback (most recent call last): 2025-12-04T13:21:31.4882841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4882882Z getattr(self, test_name)() 2025-12-04T13:21:31.4883043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4883077Z fn() 2025-12-04T13:21:31.4883229Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4883269Z method(*args, **kwargs) 2025-12-04T13:21:31.4883419Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4883457Z method(*args, **kwargs) 2025-12-04T13:21:31.4883607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4883644Z with policy(): 2025-12-04T13:21:31.4883796Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4883835Z raise RuntimeError(msg) 2025-12-04T13:21:31.4884238Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4884241Z 2025-12-04T13:21:31.4884314Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4884594Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4884597Z 2025-12-04T13:21:31.4884684Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4884687Z 2025-12-04T13:21:31.4884745Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4884790Z Traceback (most recent call last): 2025-12-04T13:21:31.4886156Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4886201Z getattr(self, test_name)() 2025-12-04T13:21:31.4886362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4886397Z fn() 2025-12-04T13:21:31.4886547Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4886586Z method(*args, **kwargs) 2025-12-04T13:21:31.4886735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4886775Z method(*args, **kwargs) 2025-12-04T13:21:31.4886923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4886959Z with policy(): 2025-12-04T13:21:31.4887111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4887166Z raise RuntimeError(msg) 2025-12-04T13:21:31.4887566Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T13:21:31.4887583Z 2025-12-04T13:21:31.4887655Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4887942Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4887945Z 2025-12-04T13:21:31.4888033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4888097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4888198Z ======================= 1 failed, 14 deselected in 7.88s ======================= 2025-12-04T13:21:31.4888236Z Got exit code 1 2025-12-04T13:21:31.4888276Z Retrying single test... 
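For context on the failure being retried above and below: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the test wrapper snapshots per-device memory counters before the test body runs and compares them again once it finishes, and any device whose usage has not returned to its baseline is reported as the "CUDA driver API confirmed a leak in ..." RuntimeError, after which the worker exits with code 10. The sketch below is only a rough illustration of that before/after comparison, written against public torch.cuda calls (memory_allocated, mem_get_info, empty_cache); the check_cuda_leak helper and the demo at the bottom are invented for this example and are not the actual code path in common_utils.py.

import torch

def check_cuda_leak(test_fn, device=0):
    # Hypothetical helper: record caching-allocator and driver-level usage,
    # run the test body, then re-measure and complain if usage grew.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)      # bytes held by the caching allocator
    free_before, total = torch.cuda.mem_get_info(device)    # driver-level (free, total) bytes
    driver_before = total - free_before

    test_fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()                                 # release cached-but-unused blocks before re-measuring
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver {driver_before} -> {driver_after} bytes"
        )

# Demo: keeping a reference to a tensor allocated inside the test body
# means its memory cannot be returned, so the check fires.
leaked = []
try:
    check_cuda_leak(lambda: leaked.append(torch.ones(1 << 20, device="cuda:0")))
except RuntimeError as err:
    print(err)

The separate ProcessGroupNCCL warning repeated in this log ("destroy_process_group() was not called before program exit") is independent of the leak checker; it only notes that the workers exited without calling torch.distributed.destroy_process_group(), which can itself leak resources.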
2025-12-04T13:21:31.4888465Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6d3b6c109b160d41.xml 2025-12-04T13:21:31.4888523Z ============================= test session starts ============================== 2025-12-04T13:21:31.4888636Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4888677Z cachedir: .pytest_cache 2025-12-04T13:21:31.4888836Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4888882Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4888922Z configfile: pytest.ini 2025-12-04T13:21:31.4889086Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4889161Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4889437Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4889480Z Running 1 items in this shard 2025-12-04T13:21:31.4889482Z 2025-12-04T13:21:31.4889837Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 13:18:58.464000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 560899 2025-12-04T13:21:31.4890013Z I1204 13:18:58.465000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 560900 2025-12-04T13:21:31.4890166Z I1204 13:18:58.466000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 560901 2025-12-04T13:21:31.4890315Z I1204 13:18:58.467000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 560902 2025-12-04T13:21:31.4890607Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4890659Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4891255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4891293Z _warn_cpu_init() 2025-12-04T13:21:31.4891593Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4891642Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4892212Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4892264Z _warn_cpu_init() 2025-12-04T13:21:31.4892554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4892631Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4892917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4892992Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4893277Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4893326Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4893897Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4893935Z _warn_cpu_init() 2025-12-04T13:21:31.4894221Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4894305Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4894588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4894638Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4895205Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4895242Z _warn_cpu_init() 2025-12-04T13:21:31.4895530Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4895615Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4895844Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4895897Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896120Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896172Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896394Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896434Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896657Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896696Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896956Z return func(*args, **kwargs) 2025-12-04T13:21:31.4897175Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4897215Z return func(*args, **kwargs) 2025-12-04T13:21:31.4897434Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4897476Z return func(*args, **kwargs) 2025-12-04T13:21:31.4897697Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4897738Z return func(*args, **kwargs) 2025-12-04T13:21:31.4898029Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T13:21:31.4898070Z return func(*args, **kwargs) 2025-12-04T13:21:31.4898268Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4898431Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4898736Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4898892Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4899177Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4899302Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4899582Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4899730Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4900081Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4900242Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4900520Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4900674Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4900953Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4901102Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4901632Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4901749Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4901945Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4902356Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4902474Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4902684Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4902850Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4902899Z dist init r=0, world=4 2025-12-04T13:21:31.4903039Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4903199Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4903486Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4903639Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4903924Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4904049Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4904335Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4904493Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4904767Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4904924Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4905198Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4905337Z [rank1]:E1204 13:19:04.428000 560900 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4905620Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4905769Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4906298Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T13:21:31.4906413Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4906610Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4907017Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4907135Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4907359Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4907524Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4907563Z dist init r=1, world=4 2025-12-04T13:21:31.4907700Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4907861Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4908195Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4908351Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4908649Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4908785Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4909061Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T13:21:31.4909221Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4909497Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4909644Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4909919Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4910054Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4910332Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4910484Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4911011Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 
2025-12-04T13:21:31.4911127Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4911322Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4911742Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4911860Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4912072Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4912236Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4912275Z dist init r=2, world=4 2025-12-04T13:21:31.4912412Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4912571Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4912868Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4913030Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4913318Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4913452Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4913729Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4913877Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4914155Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4914302Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4914576Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4914715Z [rank3]:E1204 13:19:04.507000 560902 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4914992Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4915141Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4915667Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4915801Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4915998Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4916405Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4916522Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4916735Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4916899Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4916939Z dist init r=3, world=4 2025-12-04T13:21:31.4917284Z [rank0]:[W1204 13:19:04.272107138 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4917333Z FAILED [7.5126s] [100%] 2025-12-04T13:21:31.4917335Z 2025-12-04T13:21:31.4917392Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4917537Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T13:21:31.4917594Z Traceback (most recent call last): 2025-12-04T13:21:31.4917758Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4917801Z self._join_processes(fn) 2025-12-04T13:21:31.4917975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4918028Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4918259Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4918303Z raise RuntimeError(error) 2025-12-04T13:21:31.4918384Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4918429Z Traceback (most recent call last): 2025-12-04T13:21:31.4918591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4918633Z getattr(self, test_name)() 2025-12-04T13:21:31.4918791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4918826Z fn() 2025-12-04T13:21:31.4918978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4919019Z method(*args, **kwargs) 2025-12-04T13:21:31.4919171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4919211Z method(*args, **kwargs) 2025-12-04T13:21:31.4919361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4919398Z with policy(): 2025-12-04T13:21:31.4919550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4919591Z raise RuntimeError(msg) 2025-12-04T13:21:31.4920007Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4920010Z 2025-12-04T13:21:31.4920087Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4920368Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4920371Z 2025-12-04T13:21:31.4920459Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4920462Z 2025-12-04T13:21:31.4920464Z 2025-12-04T13:21:31.4920540Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4920627Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4920874Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6d3b6c109b160d41.xml - 2025-12-04T13:21:31.4920935Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4921243Z FAILED [7.5126s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4921289Z Traceback (most recent call last): 2025-12-04T13:21:31.4921453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4921695Z getattr(self, test_name)() 2025-12-04T13:21:31.4921857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4921891Z fn() 2025-12-04T13:21:31.4922043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4922084Z method(*args, **kwargs) 2025-12-04T13:21:31.4922234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4922274Z method(*args, **kwargs) 2025-12-04T13:21:31.4922423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4922461Z with policy(): 2025-12-04T13:21:31.4922611Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4922652Z raise RuntimeError(msg) 2025-12-04T13:21:31.4923054Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 2025-12-04T13:21:31.4923058Z 2025-12-04T13:21:31.4923132Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4923413Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4923415Z 2025-12-04T13:21:31.4923503Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4923566Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4923628Z ======================= 1 failed, 18 deselected in 7.65s ======================= 2025-12-04T13:21:31.4923665Z Got exit code 1 2025-12-04T13:21:31.4923705Z Retrying single test... 2025-12-04T13:21:31.4923904Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7de742e840eb07e8.xml 2025-12-04T13:21:31.4923962Z ============================= test session starts ============================== 2025-12-04T13:21:31.4924076Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4924116Z cachedir: .pytest_cache 2025-12-04T13:21:31.4924274Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4924319Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4924359Z configfile: pytest.ini 2025-12-04T13:21:31.4924523Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4924598Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4924883Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4924928Z Running 1 items in this shard 2025-12-04T13:21:31.4924940Z 2025-12-04T13:21:31.4925293Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 13:19:08.677000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 561301 2025-12-04T13:21:31.4925449Z I1204 13:19:08.677000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 561302 2025-12-04T13:21:31.4925611Z I1204 13:19:08.678000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 561303 2025-12-04T13:21:31.4925762Z I1204 13:19:08.679000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 561304 2025-12-04T13:21:31.4926055Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4926106Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4926687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4926725Z _warn_cpu_init() 2025-12-04T13:21:31.4927016Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4927065Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4927637Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4927677Z _warn_cpu_init() 2025-12-04T13:21:31.4927964Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4928051Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4928376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4928453Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4928737Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4928787Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4929379Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4929428Z _warn_cpu_init() 2025-12-04T13:21:31.4929714Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4929788Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4930071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4930136Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4930706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4930744Z _warn_cpu_init() 2025-12-04T13:21:31.4931027Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4931102Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4931331Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4931374Z return func(*args, **kwargs) 2025-12-04T13:21:31.4931599Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4931643Z return func(*args, **kwargs) 2025-12-04T13:21:31.4931866Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4931907Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932128Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932170Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932443Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932664Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932704Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932924Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932963Z return func(*args, **kwargs) 2025-12-04T13:21:31.4933183Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4933223Z return func(*args, **kwargs) 2025-12-04T13:21:31.4933527Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T13:21:31.4933567Z return func(*args, **kwargs) 2025-12-04T13:21:31.4933721Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4933883Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4934175Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4934341Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4934627Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4934753Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4935032Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4935182Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4935460Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4935608Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4935887Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4936024Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4936304Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4936454Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4936997Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4937114Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4937308Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4937719Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4937849Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4938063Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4938288Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4938328Z dist init r=0, world=4 2025-12-04T13:21:31.4938465Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4938641Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4938929Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4939083Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4939371Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4939495Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4939773Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4939921Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4940197Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4940344Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4940621Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4940759Z [rank3]:E1204 13:19:14.664000 561304 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4941048Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4941197Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4941724Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4941841Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4942035Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4942461Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4942589Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4942799Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4942973Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4943012Z dist init r=3, world=4 2025-12-04T13:21:31.4943149Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4943309Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4943598Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4943752Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4944036Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4944161Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4944438Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T13:21:31.4944588Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4944865Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4945014Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4945298Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4945435Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4945712Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4945860Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4946397Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 
2025-12-04T13:21:31.4946512Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4946725Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4947139Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4947264Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4947476Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4947640Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4947681Z dist init r=1, world=4 2025-12-04T13:21:31.4947818Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4947978Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4948340Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4948496Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4948781Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4948905Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4949183Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4949331Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4949620Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4949768Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4950048Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4950183Z [rank2]:E1204 13:19:14.722000 561303 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4950461Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4950610Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4951147Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 2025-12-04T13:21:31.4951273Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4951480Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4951892Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4952008Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4952218Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4952382Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4952421Z dist init r=2, world=4 2025-12-04T13:21:31.4952761Z [rank0]:[W1204 13:19:14.523981233 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4952799Z FAILED [7.7129s] [100%] 2025-12-04T13:21:31.4952802Z 2025-12-04T13:21:31.4952860Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4953006Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T13:21:31.4953052Z Traceback (most recent call last): 2025-12-04T13:21:31.4953216Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4953260Z self._join_processes(fn) 2025-12-04T13:21:31.4953434Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4953487Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4953675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4953718Z raise RuntimeError(error) 2025-12-04T13:21:31.4953801Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4953846Z Traceback (most recent call last): 2025-12-04T13:21:31.4954008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4954049Z getattr(self, test_name)() 2025-12-04T13:21:31.4954206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4954241Z fn() 2025-12-04T13:21:31.4954393Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4954433Z method(*args, **kwargs) 2025-12-04T13:21:31.4954586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4954625Z method(*args, **kwargs) 2025-12-04T13:21:31.4954784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4954831Z with policy(): 2025-12-04T13:21:31.4954983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4955022Z raise RuntimeError(msg) 2025-12-04T13:21:31.4955425Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
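Note: the traceback shows the test body running inside `with policy():`, whose `__exit__` raises once GPU memory after the test exceeds the amount recorded before it. A much-simplified, hypothetical sketch of that before/after comparison follows; the real checker also queries the driver API, and this is not torch.testing's implementation:

    # Hypothetical sketch of a before/after CUDA memory comparison in the spirit
    # of the leak check above (caching-allocator bytes only).
    import contextlib
    import torch

    @contextlib.contextmanager
    def assert_no_cuda_leak(device: int = 0):
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        before = torch.cuda.memory_allocated(device)   # caching-allocator bytes in use
        yield
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: allocated memory was {before} "
                f"and is now reported as {after}"
            )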
2025-12-04T13:21:31.4955437Z 2025-12-04T13:21:31.4955513Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4955797Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4955800Z 2025-12-04T13:21:31.4955890Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4955892Z 2025-12-04T13:21:31.4955950Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4955995Z Traceback (most recent call last): 2025-12-04T13:21:31.4956157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4956200Z getattr(self, test_name)() 2025-12-04T13:21:31.4956358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4956393Z fn() 2025-12-04T13:21:31.4956544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4956584Z method(*args, **kwargs) 2025-12-04T13:21:31.4956735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4956775Z method(*args, **kwargs) 2025-12-04T13:21:31.4956924Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4956961Z with policy(): 2025-12-04T13:21:31.4957111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4957153Z raise RuntimeError(msg) 2025-12-04T13:21:31.4957563Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4957566Z 2025-12-04T13:21:31.4957641Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4957923Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4957927Z 2025-12-04T13:21:31.4958015Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4958017Z 2025-12-04T13:21:31.4958019Z 2025-12-04T13:21:31.4958094Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4958220Z Process 0 terminated with exit code 10, terminating remaining processes. 
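Note: the NCCL warning earlier in this chunk flags that destroy_process_group() was never called before the worker exited. A minimal, hypothetical shutdown sequence for a standalone distributed script (placeholder names, assuming the usual env:// rendezvous variables such as RANK and WORLD_SIZE are set; this is not the test harness's code):

    # Hypothetical sketch of explicit process-group teardown before exit.
    import torch.distributed as dist

    def main():
        dist.init_process_group(backend="nccl")
        try:
            # ... training / test body runs here ...
            dist.barrier()
        finally:
            # Tear down the default process group so resources are released cleanly.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()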
2025-12-04T13:21:31.4958456Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7de742e840eb07e8.xml - 2025-12-04T13:21:31.4958517Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4958826Z FAILED [7.7129s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4958885Z Traceback (most recent call last): 2025-12-04T13:21:31.4959047Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4959090Z getattr(self, test_name)() 2025-12-04T13:21:31.4959262Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4959297Z fn() 2025-12-04T13:21:31.4959449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4959488Z method(*args, **kwargs) 2025-12-04T13:21:31.4959641Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4959680Z method(*args, **kwargs) 2025-12-04T13:21:31.4959832Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4959870Z with policy(): 2025-12-04T13:21:31.4960021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4960062Z raise RuntimeError(msg) 2025-12-04T13:21:31.4960467Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4960470Z 2025-12-04T13:21:31.4960543Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4960822Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4960825Z 2025-12-04T13:21:31.4960912Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4960914Z 2025-12-04T13:21:31.4960972Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4961016Z Traceback (most recent call last): 2025-12-04T13:21:31.4961181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4961221Z getattr(self, test_name)() 2025-12-04T13:21:31.4961401Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4961434Z fn() 2025-12-04T13:21:31.4961586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4961626Z method(*args, **kwargs) 2025-12-04T13:21:31.4961777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4961815Z method(*args, **kwargs) 2025-12-04T13:21:31.4961965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4962003Z with policy(): 2025-12-04T13:21:31.4962157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4962197Z raise RuntimeError(msg) 2025-12-04T13:21:31.4962615Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4962627Z 2025-12-04T13:21:31.4962701Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4962978Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4962981Z 2025-12-04T13:21:31.4963078Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4963142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.4963205Z ======================= 1 failed, 18 deselected in 7.85s ======================= 2025-12-04T13:21:31.4963243Z Got exit code 1 2025-12-04T13:21:31.4963473Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4963601Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4963793Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4b555dfb546db2bb.xml 2025-12-04T13:21:31.4963852Z ============================= test session starts ============================== 2025-12-04T13:21:31.4963964Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4964006Z cachedir: .pytest_cache 2025-12-04T13:21:31.4964164Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4964211Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4964251Z configfile: pytest.ini 2025-12-04T13:21:31.4964414Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4964489Z collecting ... collected 60 items / 15 deselected / 45 selected 2025-12-04T13:21:31.4964544Z stepcurrent: skipping 15 already run items. 2025-12-04T13:21:31.4964587Z Running 4 items in this shard 2025-12-04T13:21:31.4964589Z 2025-12-04T13:21:31.4964943Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 13:19:18.930000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 561703 2025-12-04T13:21:31.4965099Z I1204 13:19:18.931000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 561704 2025-12-04T13:21:31.4965266Z I1204 13:19:18.931000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 561705 2025-12-04T13:21:31.4965417Z I1204 13:19:18.932000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 561706 2025-12-04T13:21:31.4965996Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4966036Z _warn_cpu_init() 2025-12-04T13:21:31.4966613Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4966660Z _warn_cpu_init() 2025-12-04T13:21:31.4967224Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4967272Z _warn_cpu_init() 2025-12-04T13:21:31.4967839Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4967875Z _warn_cpu_init() 2025-12-04T13:21:31.4968192Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4968234Z return func(*args, **kwargs) 2025-12-04T13:21:31.4968379Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4968541Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4968832Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4968988Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4969273Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4969399Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4969695Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4969846Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4970121Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4970269Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4970546Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4970685Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4970976Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4971135Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4971659Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.4971788Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4971987Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4972395Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4972510Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4972722Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4972887Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4972926Z dist init r=3, world=4 2025-12-04T13:21:31.4973065Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4973225Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4973512Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4973666Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4973963Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4974087Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4974363Z [rank0]:E1204 13:19:24.867000 561703 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4974512Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4974788Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4974936Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4975224Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4975370Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4975647Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4975796Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4976328Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 
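Note: the repeated c10d_logger UserWarning suggests passing `device_id` to `init_process_group` so that collectives like `barrier()` are bound to an explicit device instead of the current context. A minimal, hypothetical initialization along those lines (the LOCAL_RANK handling is an assumption, not taken from the test):

    # Hypothetical sketch: binding the process group to an explicit device.
    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)

    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),  # silences the barrier() device warning
    )
    dist.barrier()
    dist.destroy_process_group()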
2025-12-04T13:21:31.4976445Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4976640Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4977043Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4977161Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4977372Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4977536Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4977574Z dist init r=0, world=4 2025-12-04T13:21:31.4977713Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4977873Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4978221Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4978375Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4978660Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4978785Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4979060Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4979209Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4979497Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4979645Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4979934Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4980069Z [rank2]:E1204 13:19:24.897000 561705 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4980371Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4980519Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4981041Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:21:31.4981155Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4981352Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4981757Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4981870Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4982081Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4982244Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4982283Z dist init r=2, world=4 2025-12-04T13:21:31.4982420Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4982591Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4982878Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4983033Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4983319Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4983443Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4983731Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.4983878Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4984164Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4984310Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4984595Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4984732Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4985008Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4985158Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4985681Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T13:21:31.4985797Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4985992Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4986394Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4986508Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4986719Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4986895Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4986934Z dist init r=1, world=4 2025-12-04T13:21:31.4987270Z [rank0]:[W1204 13:19:25.808411610 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4987310Z FAILED [7.5132s] [ 25%] 2025-12-04T13:21:31.4987312Z 2025-12-04T13:21:31.4987368Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4987509Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T13:21:31.4987556Z Traceback (most recent call last): 2025-12-04T13:21:31.4987721Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4987763Z self._join_processes(fn) 2025-12-04T13:21:31.4987948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4988012Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4988228Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4988271Z raise RuntimeError(error) 2025-12-04T13:21:31.4988352Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4988396Z Traceback (most recent call last): 2025-12-04T13:21:31.4988572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4988613Z getattr(self, test_name)() 2025-12-04T13:21:31.4988774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4988807Z fn() 2025-12-04T13:21:31.4988964Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4989004Z method(*args, **kwargs) 2025-12-04T13:21:31.4989156Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4989196Z method(*args, **kwargs) 2025-12-04T13:21:31.4989346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4989384Z with policy(): 2025-12-04T13:21:31.4989538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4989578Z raise RuntimeError(msg) 2025-12-04T13:21:31.4989978Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
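Note: the repeated _warn_cpu_init() UserWarning recommends passing `device_id` to FSDP when the wrapped module is still on CPU, and points out that `sync_module_states=True` needs the module on a GPU device. A small, hypothetical construction following that recommendation (placeholder model; assumes the process group is already initialized and a GPU is selected):

    # Hypothetical sketch: letting FSDP move a CPU-resident module to the local GPU
    # during sharding initialization, as the warning recommends.
    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = nn.Linear(1024, 1024)  # still on CPU at this point
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # move to GPU for sharding init
        sync_module_states=True,                # requires the module on a GPU device
    )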
2025-12-04T13:21:31.4989981Z 2025-12-04T13:21:31.4990056Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4990333Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4990335Z 2025-12-04T13:21:31.4990424Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4990427Z 2025-12-04T13:21:31.4990429Z 2025-12-04T13:21:31.4990504Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4990610Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4990844Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4b555dfb546db2bb.xml - 2025-12-04T13:21:31.4990905Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4991198Z FAILED [7.5132s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4991245Z Traceback (most recent call last): 2025-12-04T13:21:31.4991409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4991451Z getattr(self, test_name)() 2025-12-04T13:21:31.4991611Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4991645Z fn() 2025-12-04T13:21:31.4991817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4991857Z method(*args, **kwargs) 2025-12-04T13:21:31.4992022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4992061Z method(*args, **kwargs) 2025-12-04T13:21:31.4992211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4992247Z with policy(): 2025-12-04T13:21:31.4992398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4992454Z raise RuntimeError(msg) 2025-12-04T13:21:31.4992853Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.4992855Z 2025-12-04T13:21:31.4992928Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4993205Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4993207Z 2025-12-04T13:21:31.4993294Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4993358Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4993420Z ======================= 1 failed, 15 deselected in 7.65s ======================= 2025-12-04T13:21:31.4993457Z Got exit code 1 2025-12-04T13:21:31.4993497Z Retrying single test... 2025-12-04T13:21:31.4993689Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-078ce9761d4e414e.xml 2025-12-04T13:21:31.4993747Z ============================= test session starts ============================== 2025-12-04T13:21:31.4993858Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4993900Z cachedir: .pytest_cache 2025-12-04T13:21:31.4994058Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4994103Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4994144Z configfile: pytest.ini 2025-12-04T13:21:31.4994307Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4994382Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4994664Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4994708Z Running 1 items in this shard 2025-12-04T13:21:31.4994711Z 2025-12-04T13:21:31.4995060Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 13:19:29.097000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 562105 2025-12-04T13:21:31.4995215Z I1204 13:19:29.098000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 562106 2025-12-04T13:21:31.4995367Z I1204 13:19:29.098000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 562107 2025-12-04T13:21:31.4995519Z I1204 13:19:29.099000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 562108 2025-12-04T13:21:31.4996106Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4996154Z _warn_cpu_init() 2025-12-04T13:21:31.4996723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4996770Z _warn_cpu_init() 2025-12-04T13:21:31.4997337Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4997374Z _warn_cpu_init() 2025-12-04T13:21:31.4997937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4997976Z _warn_cpu_init() 2025-12-04T13:21:31.4998336Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4998380Z return func(*args, **kwargs) 2025-12-04T13:21:31.4998522Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4998684Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4998990Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4999149Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4999433Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4999560Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4999838Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4999988Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5000277Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5000435Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5000711Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5000860Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5001139Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5001289Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5001815Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:21:31.5001932Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5002126Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5002532Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5002649Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5002860Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5003025Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5003063Z dist init r=0, world=4 2025-12-04T13:21:31.5003212Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5003373Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5003660Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5003814Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5004099Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5004224Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5004513Z [rank1]:E1204 13:19:35.104000 562106 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5004671Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5004946Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5005092Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5005377Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5005514Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5005790Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5005940Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5006460Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 
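Note: the RuntimeError above is raised by a leak-check policy that the test harness enters around the test body (the "with policy():" and "__exit__" frames in the traceback). It snapshots per-device memory before the test and compares afterwards. A minimal sketch of that idea, assuming nothing about the actual torch.testing._internal implementation beyond what the traceback shows:

import torch

class CudaLeakGuard:
    # Simplified stand-in for the mem-leak-check policy in the traceback above;
    # NOT the real torch.testing._internal implementation.
    def __init__(self, device=None):
        self.device = torch.cuda.current_device() if device is None else device

    def __enter__(self):
        torch.cuda.synchronize(self.device)
        self.alloc_before = torch.cuda.memory_allocated(self.device)   # caching-allocator bytes
        free, total = torch.cuda.mem_get_info(self.device)
        self.driver_before = total - free                               # driver-level bytes in use
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # never mask the test's own failure
        torch.cuda.synchronize(self.device)
        alloc_after = torch.cuda.memory_allocated(self.device)
        free, total = torch.cuda.mem_get_info(self.device)
        driver_after = total - free
        # Only flag a leak when both the allocator and the driver report growth,
        # echoing the "CUDA driver API confirmed a leak" wording in the log.
        if alloc_after > self.alloc_before and driver_after > self.driver_before:
            raise RuntimeError(
                f"possible CUDA memory leak on device {self.device}: "
                f"allocator {self.alloc_before} -> {alloc_after} bytes, "
                f"driver {self.driver_before} -> {driver_after} bytes"
            )
        return False

A test body would then run under "with CudaLeakGuard():", which is the shape of the "with policy():" frame shown in the traceback.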
2025-12-04T13:21:31.5006576Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5006772Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5007174Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5007289Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5007510Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5007675Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5007714Z dist init r=1, world=4 2025-12-04T13:21:31.5007856Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5008017Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5008531Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5008685Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5008984Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5009108Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5009404Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5009552Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5009842Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5009987Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5010265Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5010401Z [rank2]:E1204 13:19:35.115000 562107 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5010677Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5010826Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5011346Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:21:31.5011462Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5011656Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5012071Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5012185Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5012398Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5012563Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5012600Z dist init r=2, world=4 2025-12-04T13:21:31.5012738Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5012897Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5013194Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5013347Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5013642Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5013764Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5014051Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.5014201Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5014477Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5014625Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5014899Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5015037Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5015315Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5015465Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5015985Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.5016100Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5016309Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5016711Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5016824Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5017034Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5017199Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5017239Z dist init r=3, world=4 2025-12-04T13:21:31.5019650Z [rank0]:[W1204 13:19:35.936786860 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5019713Z FAILED [7.6145s] [100%] 2025-12-04T13:21:31.5019716Z 2025-12-04T13:21:31.5019777Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5019918Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T13:21:31.5019966Z Traceback (most recent call last): 2025-12-04T13:21:31.5020144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5020190Z self._join_processes(fn) 2025-12-04T13:21:31.5020365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5020418Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5020599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5020642Z raise RuntimeError(error) 2025-12-04T13:21:31.5020725Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5020770Z Traceback (most recent call last): 2025-12-04T13:21:31.5020933Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5020977Z getattr(self, test_name)() 2025-12-04T13:21:31.5021136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5021171Z fn() 2025-12-04T13:21:31.5021324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5021364Z method(*args, **kwargs) 2025-12-04T13:21:31.5021519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5021559Z method(*args, **kwargs) 2025-12-04T13:21:31.5021710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5021747Z with policy(): 2025-12-04T13:21:31.5021899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5021941Z raise RuntimeError(msg) 2025-12-04T13:21:31.5022356Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 
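Note: the "Process 0 exited with error code 10" failure is the parent test process reacting to a child rank's exit status; the "_join_processes" / "_check_return_codes" frames in the traceback join all ranks and raise if any exit code is non-zero. A rough sketch of that pattern, using only the standard library rather than PyTorch's test harness:

import multiprocessing as mp

def _worker(rank: int) -> None:
    # Stand-in for one rank's test body; a detected leak makes the rank
    # exit with code 10, as in "exiting process N with exit code: 10" above.
    raise SystemExit(10)

def run_multiprocess_test(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # The parent converts the first bad child exit code into the test failure.
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_multiprocess_test()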
2025-12-04T13:21:31.5022358Z 2025-12-04T13:21:31.5022436Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5022715Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5022718Z 2025-12-04T13:21:31.5022808Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5022810Z 2025-12-04T13:21:31.5022812Z 2025-12-04T13:21:31.5022889Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5022979Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5023218Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-078ce9761d4e414e.xml - 2025-12-04T13:21:31.5023280Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5023582Z FAILED [7.6145s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5023639Z Traceback (most recent call last): 2025-12-04T13:21:31.5023803Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5023846Z getattr(self, test_name)() 2025-12-04T13:21:31.5024006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5024050Z fn() 2025-12-04T13:21:31.5024203Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5024244Z method(*args, **kwargs) 2025-12-04T13:21:31.5024396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5024435Z method(*args, **kwargs) 2025-12-04T13:21:31.5024586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5024622Z with policy(): 2025-12-04T13:21:31.5024774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5024813Z raise RuntimeError(msg) 2025-12-04T13:21:31.5025214Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:21:31.5025218Z 2025-12-04T13:21:31.5025292Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5025572Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5025575Z 2025-12-04T13:21:31.5025663Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5025727Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5025789Z ======================= 1 failed, 18 deselected in 7.75s ======================= 2025-12-04T13:21:31.5025828Z Got exit code 1 2025-12-04T13:21:31.5025868Z Retrying single test... 2025-12-04T13:21:31.5026060Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9f2bd3f7b2fc9639.xml 2025-12-04T13:21:31.5026129Z ============================= test session starts ============================== 2025-12-04T13:21:31.5026243Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5026284Z cachedir: .pytest_cache 2025-12-04T13:21:31.5026444Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5026490Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5026531Z configfile: pytest.ini 2025-12-04T13:21:31.5026696Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5026771Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5027042Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5027086Z Running 1 items in this shard 2025-12-04T13:21:31.5027088Z 2025-12-04T13:21:31.5027459Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 13:19:39.227000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 562507 2025-12-04T13:21:31.5027626Z I1204 13:19:39.228000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 562508 2025-12-04T13:21:31.5027777Z I1204 13:19:39.228000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 562509 2025-12-04T13:21:31.5027937Z I1204 13:19:39.229000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 562510 2025-12-04T13:21:31.5028559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5028599Z _warn_cpu_init() 2025-12-04T13:21:31.5029165Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.5029204Z _warn_cpu_init() 2025-12-04T13:21:31.5029771Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5029807Z _warn_cpu_init() 2025-12-04T13:21:31.5030377Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5030414Z _warn_cpu_init() 2025-12-04T13:21:31.5030722Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.5030766Z return func(*args, **kwargs) 2025-12-04T13:21:31.5030908Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5031071Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5031359Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5031516Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5031813Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5031953Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5032235Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5032399Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5032677Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5032824Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5033100Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5033238Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5033516Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5033666Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5034190Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T13:21:31.5034307Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5034503Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5034922Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5035037Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5035251Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5035416Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5035455Z dist init r=1, world=4 2025-12-04T13:21:31.5035594Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5035753Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5036051Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5036216Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5036502Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5036636Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5036916Z [rank3]:E1204 13:19:45.336000 562510 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5037066Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5037341Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5037490Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5037764Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5037901Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5038215Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5038365Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5038887Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
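Note: the repeated "_warn_cpu_init()" UserWarning earlier in this session recommends passing "device_id" so that FSDP runs its sharding initialization on GPU and "sync_module_states=True" has a GPU-resident module to communicate. A minimal sketch of that recommendation, assuming a process group is already initialized and using a trivial nn.Linear as a placeholder for the real model:

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

model = nn.Linear(8, 8)  # placeholder module, still on CPU at this point
fsdp_model = FSDP(
    model,
    device_id=torch.cuda.current_device(),  # lets FSDP move the module to the local GPU for sharding init
    sync_module_states=True,                # requires the module on GPU, per the warning text
)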
2025-12-04T13:21:31.5039003Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5039212Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5039619Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5039735Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5039946Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5040111Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5040150Z dist init r=3, world=4 2025-12-04T13:21:31.5040299Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5040459Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5040757Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5040913Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5041213Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5041338Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5041617Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5041768Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5042045Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5042192Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5042473Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5042609Z [rank2]:E1204 13:19:45.384000 562509 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5042886Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5043034Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5043572Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:21:31.5043688Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5043886Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5044290Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5044403Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5044627Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5044791Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5044840Z dist init r=2, world=4 2025-12-04T13:21:31.5044977Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5045136Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5045432Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5045586Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5045872Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5045996Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5046275Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.5046424Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5046701Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5046849Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5047125Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5047261Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5047540Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5047699Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5048265Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:21:31.5048380Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5048577Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5048994Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5049122Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5049332Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5049496Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5049545Z dist init r=0, world=4 2025-12-04T13:21:31.5049885Z [rank0]:[W1204 13:19:45.468384433 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5049925Z FAILED [7.8138s] [100%] 2025-12-04T13:21:31.5049927Z 2025-12-04T13:21:31.5049984Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5050126Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T13:21:31.5050172Z Traceback (most recent call last): 2025-12-04T13:21:31.5050336Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5050378Z self._join_processes(fn) 2025-12-04T13:21:31.5050553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5050606Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5050787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5050830Z raise RuntimeError(error) 2025-12-04T13:21:31.5050912Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5050957Z Traceback (most recent call last): 2025-12-04T13:21:31.5051118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5051159Z getattr(self, test_name)() 2025-12-04T13:21:31.5051317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5051352Z fn() 2025-12-04T13:21:31.5051503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5051544Z method(*args, **kwargs) 2025-12-04T13:21:31.5051713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5051753Z method(*args, **kwargs) 2025-12-04T13:21:31.5051905Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5051944Z with policy(): 2025-12-04T13:21:31.5052095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5052136Z raise RuntimeError(msg) 2025-12-04T13:21:31.5052534Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
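Note: the reported numbers are consistent across ranks and across both retries of the nested-wrapped-model test. Driver-allocated memory grows by 3,391,094,784 - 2,250,244,096 = 1,140,850,688 bytes (exactly 1088 MiB) on device 3, and the same delta holds on the other devices (e.g. 3,594,518,528 - 2,453,667,840 = 1,140,850,688 on device 0), while the caching allocator only grows from 512 bytes to roughly 48-61 KiB. An identical, fixed-size growth on every device looks more like a one-time per-device allocation left resident (communicator or library buffers, for example) than an unbounded leak, though the log alone cannot confirm that.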
2025-12-04T13:21:31.5052537Z 2025-12-04T13:21:31.5052613Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5052904Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5052906Z 2025-12-04T13:21:31.5053006Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5053008Z 2025-12-04T13:21:31.5053010Z 2025-12-04T13:21:31.5053086Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5053173Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5053407Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9f2bd3f7b2fc9639.xml - 2025-12-04T13:21:31.5053479Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5053768Z FAILED [7.8138s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5053815Z Traceback (most recent call last): 2025-12-04T13:21:31.5053978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5054021Z getattr(self, test_name)() 2025-12-04T13:21:31.5054180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5054214Z fn() 2025-12-04T13:21:31.5054366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5054406Z method(*args, **kwargs) 2025-12-04T13:21:31.5054557Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5054597Z method(*args, **kwargs) 2025-12-04T13:21:31.5054748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5054785Z with policy(): 2025-12-04T13:21:31.5054938Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5054978Z raise RuntimeError(msg) 2025-12-04T13:21:31.5055376Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.5055379Z 2025-12-04T13:21:31.5055453Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5055740Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5055743Z 2025-12-04T13:21:31.5055832Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5055896Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5055958Z ======================= 1 failed, 18 deselected in 7.95s ======================= 2025-12-04T13:21:31.5055995Z Got exit code 1 2025-12-04T13:21:31.5056219Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5056347Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5056538Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8ed907b9022c6610.xml 2025-12-04T13:21:31.5056596Z ============================= test session starts ============================== 2025-12-04T13:21:31.5056720Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5056773Z cachedir: .pytest_cache 2025-12-04T13:21:31.5056930Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5056977Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5057017Z configfile: pytest.ini 2025-12-04T13:21:31.5057179Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5057271Z collecting ... collected 60 items / 16 deselected / 44 selected 2025-12-04T13:21:31.5057325Z stepcurrent: skipping 16 already run items. 2025-12-04T13:21:31.5057368Z Running 3 items in this shard 2025-12-04T13:21:31.5057370Z 2025-12-04T13:21:31.5057682Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 13:19:49.712000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 562909 2025-12-04T13:21:31.5057836Z I1204 13:19:49.713000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 562910 2025-12-04T13:21:31.5057989Z I1204 13:19:49.714000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 562911 2025-12-04T13:21:31.5058138Z I1204 13:19:49.714000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 562912 2025-12-04T13:21:31.5058554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5058604Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5058960Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5059009Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5059360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5059407Z self.encoder = TransformerEncoder( 
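Note: two warnings in this run point at process-group lifecycle hygiene: the ProcessGroupNCCL message about "destroy_process_group()" never being called before exit, and the "barrier(): using the device under current context" message recommending "device_id" in "init_process_group". A minimal per-rank sketch addressing both, assuming MASTER_ADDR/MASTER_PORT are already set in the environment and that the installed PyTorch is recent enough to accept "init_process_group(device_id=...)":

import torch
import torch.distributed as dist

def run_rank(rank: int, world_size: int) -> None:
    torch.cuda.set_device(rank)
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # silences the barrier() device warning
    )
    try:
        dist.barrier()
        # ... test body ...
    finally:
        # Avoids the "destroy_process_group() was not called before program exit" warning.
        dist.destroy_process_group()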
2025-12-04T13:21:31.5059777Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5059823Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5060403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5060443Z _warn_cpu_init() 2025-12-04T13:21:31.5061026Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5061064Z _warn_cpu_init() 2025-12-04T13:21:31.5061640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5061689Z _warn_cpu_init() 2025-12-04T13:21:31.5062258Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5062296Z _warn_cpu_init() 2025-12-04T13:21:31.5062585Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.5062628Z return func(*args, **kwargs) 2025-12-04T13:21:31.5062770Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5062934Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5063225Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5063380Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5063667Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5063792Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5064071Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5064235Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5064516Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5064665Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5064941Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5065080Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5065366Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5065515Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5066007Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5066135Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5066336Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5066699Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5066815Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5067027Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5067193Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5067232Z dist init r=2, world=4 2025-12-04T13:21:31.5067371Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5067531Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5067819Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5067972Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5068315Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5068459Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5068738Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5068887Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5069162Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5069310Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5069585Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5069737Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5070027Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5070175Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5070670Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5070787Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5070984Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5071345Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5071461Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5071676Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5071840Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5071880Z dist init r=0, world=4 2025-12-04T13:21:31.5072017Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5072177Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5072464Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5072620Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5072920Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5073045Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5073325Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5073472Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5073750Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5073909Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5074184Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5074339Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5074615Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5074775Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5075254Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5075370Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5075565Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5075929Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5076042Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5076255Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5076420Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5076457Z dist init r=3, world=4 2025-12-04T13:21:31.5076594Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5076754Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5077058Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5077213Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5077498Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5077623Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5077902Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5078051Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5078380Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5078540Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5078815Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5078965Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5079244Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5079393Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5079870Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:21:31.5079985Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5080182Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5080545Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5080659Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5080870Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5081034Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5081072Z dist init r=1, world=4 2025-12-04T13:21:31.5081422Z [rank0]:[W1204 13:19:59.959437921 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5081465Z FAILED [11.5186s] [ 33%] 2025-12-04T13:21:31.5081467Z 2025-12-04T13:21:31.5081524Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5081626Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____ 2025-12-04T13:21:31.5081672Z Traceback (most recent call last): 2025-12-04T13:21:31.5081835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5081880Z self._join_processes(fn) 2025-12-04T13:21:31.5082052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5082106Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5082284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5082345Z raise RuntimeError(error) 2025-12-04T13:21:31.5082425Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5082483Z Traceback (most recent call last): 2025-12-04T13:21:31.5082646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5082688Z getattr(self, test_name)() 2025-12-04T13:21:31.5082846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5082891Z fn() 2025-12-04T13:21:31.5083042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5083083Z method(*args, **kwargs) 2025-12-04T13:21:31.5083234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5083274Z method(*args, **kwargs) 2025-12-04T13:21:31.5083425Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5083463Z with policy(): 2025-12-04T13:21:31.5083614Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5083656Z raise RuntimeError(msg) 2025-12-04T13:21:31.5084009Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5084013Z 2025-12-04T13:21:31.5084089Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5084323Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5084326Z 2025-12-04T13:21:31.5084415Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5084417Z 2025-12-04T13:21:31.5084478Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5084523Z Traceback (most recent call last): 2025-12-04T13:21:31.5084686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5084728Z getattr(self, test_name)() 2025-12-04T13:21:31.5084889Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5084922Z fn() 2025-12-04T13:21:31.5085083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5085122Z method(*args, **kwargs) 2025-12-04T13:21:31.5085275Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5085315Z method(*args, **kwargs) 2025-12-04T13:21:31.5085465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5085502Z with policy(): 2025-12-04T13:21:31.5085653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5085694Z raise RuntimeError(msg) 2025-12-04T13:21:31.5086046Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5086049Z 2025-12-04T13:21:31.5086122Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5086361Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5086373Z 2025-12-04T13:21:31.5086462Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5086464Z 2025-12-04T13:21:31.5086523Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5086568Z Traceback (most recent call last): 2025-12-04T13:21:31.5086739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5086781Z getattr(self, test_name)() 2025-12-04T13:21:31.5086940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5086974Z fn() 2025-12-04T13:21:31.5087126Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5087166Z method(*args, **kwargs) 2025-12-04T13:21:31.5087317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5087356Z method(*args, **kwargs) 2025-12-04T13:21:31.5087507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5087543Z with policy(): 2025-12-04T13:21:31.5087695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5087735Z raise RuntimeError(msg) 2025-12-04T13:21:31.5088090Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5088093Z 2025-12-04T13:21:31.5088206Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5088437Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5088439Z 2025-12-04T13:21:31.5088527Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5088530Z 2025-12-04T13:21:31.5088532Z 2025-12-04T13:21:31.5088608Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5088696Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5088953Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8ed907b9022c6610.xml - 2025-12-04T13:21:31.5089016Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5089264Z FAILED [11.5186s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5089312Z Traceback (most recent call last): 2025-12-04T13:21:31.5089476Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5089518Z getattr(self, test_name)() 2025-12-04T13:21:31.5089678Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5089712Z fn() 2025-12-04T13:21:31.5089864Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5089904Z method(*args, **kwargs) 2025-12-04T13:21:31.5090075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5090127Z method(*args, **kwargs) 2025-12-04T13:21:31.5090276Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5090314Z with policy(): 2025-12-04T13:21:31.5090465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5090518Z raise RuntimeError(msg) 2025-12-04T13:21:31.5090868Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T13:21:31.5090872Z 2025-12-04T13:21:31.5090944Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5091175Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5091178Z 2025-12-04T13:21:31.5091264Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5091267Z 2025-12-04T13:21:31.5091325Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5091369Z Traceback (most recent call last): 2025-12-04T13:21:31.5091534Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5091575Z getattr(self, test_name)() 2025-12-04T13:21:31.5091736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5091770Z fn() 2025-12-04T13:21:31.5091922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5091961Z method(*args, **kwargs) 2025-12-04T13:21:31.5092112Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5092150Z method(*args, **kwargs) 2025-12-04T13:21:31.5092300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5092336Z with policy(): 2025-12-04T13:21:31.5092488Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5092528Z raise RuntimeError(msg) 2025-12-04T13:21:31.5092894Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5092896Z 2025-12-04T13:21:31.5092971Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5093199Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5093202Z 2025-12-04T13:21:31.5093289Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5093292Z 2025-12-04T13:21:31.5093350Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5093395Z Traceback (most recent call last): 2025-12-04T13:21:31.5093558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5093600Z getattr(self, test_name)() 2025-12-04T13:21:31.5093768Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5093803Z fn() 2025-12-04T13:21:31.5093963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5094003Z method(*args, **kwargs) 2025-12-04T13:21:31.5094153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5094193Z method(*args, **kwargs) 2025-12-04T13:21:31.5094353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5094389Z with policy(): 2025-12-04T13:21:31.5094542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5094583Z raise RuntimeError(msg) 2025-12-04T13:21:31.5094935Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5094938Z 2025-12-04T13:21:31.5095009Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5095238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5095242Z 2025-12-04T13:21:31.5095327Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5095394Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5095460Z ====================== 1 failed, 16 deselected in 11.66s ======================= 2025-12-04T13:21:31.5095497Z Got exit code 1 2025-12-04T13:21:31.5095536Z Retrying single test... 
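Each failing session above prints the same three UserWarnings: FSDP sharding initialization running on CPU, barrier() being issued without a device_id on the process group, and destroy_process_group() never being called before exit. As a hedged illustration only (this is not code from test_fsdp_core.py; MyModel and run are hypothetical placeholders, and the sketch assumes it is launched per-rank, e.g. via torchrun, with MASTER_ADDR/MASTER_PORT set), the pattern those warnings recommend looks roughly like this:

    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    class MyModel(nn.Module):  # hypothetical stand-in for the real test model
        def __init__(self) -> None:
            super().__init__()
            self.linear = nn.Linear(8, 8)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.linear(x)

    def run(rank: int, world_size: int) -> None:
        device = torch.device("cuda", rank)
        # Passing device_id here is what the c10d barrier() warning suggests.
        dist.init_process_group("nccl", rank=rank, world_size=world_size, device_id=device)
        # Giving FSDP a device_id avoids the "sharding initialization run on CPU"
        # warning and moves the module to the GPU before sharding, as the
        # _init_utils warning text recommends.
        model = FSDP(MyModel(), device_id=device)
        x = torch.randn(4, 8, device=device)
        model(x).sum().backward()
        # Explicit teardown avoids the ProcessGroupNCCL "destroy_process_group()
        # was not called before program exit" warning seen at the end of each run.
        dist.destroy_process_group()

These warnings are benign for the test itself, but they explain the repeated device_id recommendations in the output above.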
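The RuntimeError text ("Caching allocator allocated memory was 512 and is now reported as ...") is produced by PyTorch's CUDA memory-leak checker, enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, which snapshots per-device memory before and after the test body. The following is a deliberately simplified sketch of that before/after comparison, not the actual implementation in common_utils.py; the real check also compares driver-level allocations and applies retry/threshold logic, both omitted here:

    import torch

    def check_leak(fn, device: int = 0) -> None:
        # Simplified stand-in for the mem_leak_check mode: compare the caching
        # allocator's live-tensor usage on one device before and after the test body.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)  # bytes held by live tensors
        fn()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator allocated "
                f"memory was {before} and is now reported as {after}"
            )

A test body that drops all of its tensor references passes such a check; one that stashes GPU tensors on a module or global survives the comparison and is reported as a leak, which is what exit code 10 signals for each rank above.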
2025-12-04T13:21:31.5095729Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-84cd3a84cc53b053.xml 2025-12-04T13:21:31.5095787Z ============================= test session starts ============================== 2025-12-04T13:21:31.5095899Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5095940Z cachedir: .pytest_cache 2025-12-04T13:21:31.5096098Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5096145Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5096185Z configfile: pytest.ini 2025-12-04T13:21:31.5096364Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5096439Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5096665Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5096710Z Running 1 items in this shard 2025-12-04T13:21:31.5096712Z 2025-12-04T13:21:31.5097019Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 13:20:03.663000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 563311 2025-12-04T13:21:31.5097173Z I1204 13:20:03.663000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 563312 2025-12-04T13:21:31.5097326Z I1204 13:20:03.664000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 563313 2025-12-04T13:21:31.5097476Z I1204 13:20:03.664000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 563314 2025-12-04T13:21:31.5097848Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5097908Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5098300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5098363Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5098716Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5098763Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5099113Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5099159Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5099737Z 
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5099776Z _warn_cpu_init() 2025-12-04T13:21:31.5100343Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5100380Z _warn_cpu_init() 2025-12-04T13:21:31.5100959Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5100999Z _warn_cpu_init() 2025-12-04T13:21:31.5101564Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5101603Z _warn_cpu_init() 2025-12-04T13:21:31.5101893Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.5101936Z return func(*args, **kwargs) 2025-12-04T13:21:31.5102080Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5102255Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5102557Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5102711Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5103008Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5103134Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5103413Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5103564Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5103843Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5103992Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5104268Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5104406Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5104684Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5104833Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5105328Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:21:31.5105446Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5105641Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5106004Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5106120Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5106333Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5106508Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5106558Z dist init r=1, world=4 2025-12-04T13:21:31.5106697Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5106856Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5107143Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5107312Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5107597Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5107723Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5107998Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5108194Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5108471Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5108620Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5108894Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5109031Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5109310Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5109469Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5109947Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5110063Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5110258Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5110623Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5110748Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5110960Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5111136Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5111175Z dist init r=3, world=4 2025-12-04T13:21:31.5111311Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5111486Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5111773Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5111927Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5112212Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5112336Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5112614Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5112762Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5113041Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5113188Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5113463Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5113600Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5113887Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5114035Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5114512Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5114627Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5114822Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5115192Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5115318Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5115528Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5115704Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5115742Z dist init r=0, world=4 2025-12-04T13:21:31.5115880Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5116040Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5116327Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5116480Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5116765Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5116890Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5117165Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5117314Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5117590Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5117738Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5118020Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5118191Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5118469Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5118617Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5119095Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5119226Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5119423Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5119793Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5119929Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5120142Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5120306Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5120345Z dist init r=2, world=4 2025-12-04T13:21:31.5120681Z [rank0]:[W1204 13:20:13.965978040 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5120722Z FAILED [11.2154s] [100%] 2025-12-04T13:21:31.5120724Z 2025-12-04T13:21:31.5120781Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5120886Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____ 2025-12-04T13:21:31.5120932Z Traceback (most recent call last): 2025-12-04T13:21:31.5121097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5121140Z self._join_processes(fn) 2025-12-04T13:21:31.5121314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5121369Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5121548Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5121591Z raise RuntimeError(error) 2025-12-04T13:21:31.5121672Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5121717Z Traceback (most recent call last): 2025-12-04T13:21:31.5121879Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5121922Z getattr(self, test_name)() 2025-12-04T13:21:31.5122092Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5122128Z fn() 2025-12-04T13:21:31.5122283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5122324Z method(*args, **kwargs) 2025-12-04T13:21:31.5122474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5122516Z method(*args, **kwargs) 2025-12-04T13:21:31.5122665Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5122704Z with policy(): 2025-12-04T13:21:31.5122855Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5122897Z raise RuntimeError(msg) 2025-12-04T13:21:31.5123260Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T13:21:31.5123273Z 2025-12-04T13:21:31.5123349Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5123580Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5123582Z 2025-12-04T13:21:31.5123671Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5123683Z 2025-12-04T13:21:31.5123685Z 2025-12-04T13:21:31.5123760Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5123848Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5124083Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-84cd3a84cc53b053.xml - 2025-12-04T13:21:31.5124144Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5124397Z FAILED [11.2154s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5124442Z Traceback (most recent call last): 2025-12-04T13:21:31.5124607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5124650Z getattr(self, test_name)() 2025-12-04T13:21:31.5124810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5124844Z fn() 2025-12-04T13:21:31.5124996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5125037Z method(*args, **kwargs) 2025-12-04T13:21:31.5125188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5125228Z method(*args, **kwargs) 2025-12-04T13:21:31.5125379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5125415Z with policy(): 2025-12-04T13:21:31.5125567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5125609Z raise RuntimeError(msg) 2025-12-04T13:21:31.5125971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:21:31.5125974Z 2025-12-04T13:21:31.5126050Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5126280Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5126282Z 2025-12-04T13:21:31.5126370Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5126432Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5126496Z ====================== 1 failed, 18 deselected in 11.36s ======================= 2025-12-04T13:21:31.5126533Z Got exit code 1 2025-12-04T13:21:31.5126574Z Retrying single test... 2025-12-04T13:21:31.5126764Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8fb47012cbd58a54.xml 2025-12-04T13:21:31.5126821Z ============================= test session starts ============================== 2025-12-04T13:21:31.5126942Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5126993Z cachedir: .pytest_cache 2025-12-04T13:21:31.5127149Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5127196Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5127235Z configfile: pytest.ini 2025-12-04T13:21:31.5127398Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5127484Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5127708Z stepcurrent: skipping 16 already run items. 
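The "CUDA driver API confirmed a leak" failures above are raised by the memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it records caching-allocator and driver-level memory before the test and compares again afterwards. A minimal standalone sketch of that idea follows; it is not the actual CudaMemoryLeakCheck policy in common_utils.py, and check_for_cuda_leak is a hypothetical helper used only for illustration.

# Sketch only: approximates the before/after comparison described in the log,
# not the torch.testing._internal.common_utils implementation.
import torch

def check_for_cuda_leak(test_fn, device=0):
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
    free_before, total = torch.cuda.mem_get_info(device)    # driver-level free/total bytes
    driver_before = total - free_before

    test_fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA leak: allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after} bytes on device {device}"
        )

In the failing run above, allocator usage on device 1 grew from 512 to 295424 bytes and driver-allocated memory from 2317352960 to 3919577088 bytes, which is exactly the kind of growth this comparison flags.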
Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5127753Z Running 1 items in this shard 2025-12-04T13:21:31.5127755Z 2025-12-04T13:21:31.5128062Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 13:20:17.507000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 563713 2025-12-04T13:21:31.5128250Z I1204 13:20:17.507000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 563714 2025-12-04T13:21:31.5128402Z I1204 13:20:17.508000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 563715 2025-12-04T13:21:31.5128554Z I1204 13:20:17.509000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 563716 2025-12-04T13:21:31.5128913Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5128963Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5129317Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5129364Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5129715Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5129761Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5130126Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5130171Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5130746Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5130785Z _warn_cpu_init() 2025-12-04T13:21:31.5131364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.5131413Z _warn_cpu_init() 2025-12-04T13:21:31.5131981Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5132030Z _warn_cpu_init() 2025-12-04T13:21:31.5132598Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5132634Z _warn_cpu_init() 2025-12-04T13:21:31.5132926Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.5132968Z return func(*args, **kwargs) 2025-12-04T13:21:31.5133112Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5133275Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5133566Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5133720Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5134009Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5134136Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5134421Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5134572Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5134849Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5134996Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5135273Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5135413Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5135700Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5135865Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5136348Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 2025-12-04T13:21:31.5136475Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5136671Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5137033Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5137147Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5137360Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5137524Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5137563Z dist init r=2, world=4 2025-12-04T13:21:31.5137702Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5137862Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5138173Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5138330Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5138632Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5138759Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5139035Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5139183Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5139458Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5139606Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5139893Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5140041Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5140318Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5140481Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5140963Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T13:21:31.5141082Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5141280Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5141638Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5141753Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5141966Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5142130Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5142168Z dist init r=0, world=4 2025-12-04T13:21:31.5142306Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5142465Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5142753Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5142920Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5143208Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5143334Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5143610Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5143759Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5144034Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5144192Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5144475Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5144611Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5144896Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5145046Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5145526Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T13:21:31.5145642Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5145839Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5146195Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5146309Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5146522Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5146686Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5146825Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5146984Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5147279Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5147432Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5147717Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5147840Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5148120Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5148313Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5148602Z [rank3]:E1204 
13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5148762Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5149036Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5149184Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5149462Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5149612Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5150088Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5150204Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5150402Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5150759Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5150873Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5151085Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5151251Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5151290Z dist init r=1, world=4 2025-12-04T13:21:31.5151339Z dist init r=3, world=4 2025-12-04T13:21:31.5151677Z [rank0]:[W1204 13:20:27.745944812 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5151717Z FAILED [11.3165s] [100%] 2025-12-04T13:21:31.5151720Z 2025-12-04T13:21:31.5151778Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5151878Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____ 2025-12-04T13:21:31.5151923Z Traceback (most recent call last): 2025-12-04T13:21:31.5152087Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5152131Z self._join_processes(fn) 2025-12-04T13:21:31.5152304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5152359Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5152555Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5152608Z raise RuntimeError(error) 2025-12-04T13:21:31.5152689Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5152734Z Traceback (most recent call last): 2025-12-04T13:21:31.5152895Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5152947Z getattr(self, test_name)() 2025-12-04T13:21:31.5153106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5153141Z fn() 2025-12-04T13:21:31.5153294Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5153333Z method(*args, **kwargs) 2025-12-04T13:21:31.5153486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5153525Z method(*args, **kwargs) 2025-12-04T13:21:31.5153676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5153712Z with policy(): 2025-12-04T13:21:31.5153864Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5153904Z raise RuntimeError(msg) 2025-12-04T13:21:31.5154259Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T13:21:31.5154261Z 2025-12-04T13:21:31.5154336Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5154568Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5154571Z 2025-12-04T13:21:31.5154660Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5154663Z 2025-12-04T13:21:31.5154665Z 2025-12-04T13:21:31.5154741Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5154830Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5155064Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8fb47012cbd58a54.xml - 2025-12-04T13:21:31.5155137Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5155385Z FAILED [11.3165s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5155432Z Traceback (most recent call last): 2025-12-04T13:21:31.5155596Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5155638Z getattr(self, test_name)() 2025-12-04T13:21:31.5155797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5155833Z fn() 2025-12-04T13:21:31.5155984Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5156025Z method(*args, **kwargs) 2025-12-04T13:21:31.5156177Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5156217Z method(*args, **kwargs) 2025-12-04T13:21:31.5156376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5156423Z with policy(): 2025-12-04T13:21:31.5156575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5156616Z raise RuntimeError(msg) 2025-12-04T13:21:31.5156968Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5156980Z 2025-12-04T13:21:31.5157055Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5157290Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5157292Z 2025-12-04T13:21:31.5157379Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5157444Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
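Two of the warnings in this run concern process-group lifecycle: the barrier() warning notes the device can be made explicit by passing `device_id` to `init_process_group`, and ProcessGroupNCCL warns that `destroy_process_group()` was never called before exit (see https://pytorch.org/docs/stable/distributed.html#shutdown). A minimal sketch of both recommendations, assuming a recent torch.distributed where `init_process_group` accepts `device_id` and the usual RANK/LOCAL_RANK launcher environment variables:

# Sketch only: explicit device binding at init and explicit teardown at exit.
import os
import torch
import torch.distributed as dist

def run_distributed(worker_fn):
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)                    # bind this process to one GPU
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),      # mutes the barrier() device warning
    )
    try:
        worker_fn()
        dist.barrier()                                   # runs on the declared device
    finally:
        dist.destroy_process_group()                     # avoids the ProcessGroupNCCL leak warning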
2025-12-04T13:21:31.5157506Z ====================== 1 failed, 18 deselected in 11.46s ======================= 2025-12-04T13:21:31.5157543Z Got exit code 1 2025-12-04T13:21:31.5157722Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5157851Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5158042Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aedcf6ff3ae4698.xml 2025-12-04T13:21:31.5158100Z ============================= test session starts ============================== 2025-12-04T13:21:31.5158258Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5158301Z cachedir: .pytest_cache 2025-12-04T13:21:31.5158458Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5158503Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5158544Z configfile: pytest.ini 2025-12-04T13:21:31.5158705Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5158781Z collecting ... collected 60 items / 17 deselected / 43 selected 2025-12-04T13:21:31.5158835Z stepcurrent: skipping 17 already run items. 2025-12-04T13:21:31.5158879Z Running 2 items in this shard 2025-12-04T13:21:31.5158881Z 2025-12-04T13:21:31.5159197Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:20:31.409000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 564115 2025-12-04T13:21:31.5159353Z I1204 13:20:31.409000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 564116 2025-12-04T13:21:31.5159507Z I1204 13:20:31.410000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 564117 2025-12-04T13:21:31.5159657Z I1204 13:20:31.410000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 564118 2025-12-04T13:21:31.5160017Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5160066Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5160370Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5160448Z {} 2025-12-04T13:21:31.5160555Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5160630Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5161125Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5161200Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5161556Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5161605Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5161894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5161959Z {} 2025-12-04T13:21:31.5162063Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5162138Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5162628Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5162691Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5163043Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5163092Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5163389Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5163454Z {} 2025-12-04T13:21:31.5163556Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5163629Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5164121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
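The FSDP warnings above (`_warn_cpu_init` and the `device_id` messages) both recommend giving FSDP an explicit device: either call `torch.cuda.set_device()` before wrapping or pass a concrete device index as `device_id`, so sharding initialization and `sync_module_states=True` run on the GPU rather than the CPU. A minimal sketch of that advice, where `rank` is a hypothetical per-process rank:

# Sketch only: explicit device index for FSDP wrapping, as the warnings suggest.
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(module, rank):
    torch.cuda.set_device(rank)                          # make the current device explicit
    return FSDP(
        module,
        device_id=torch.cuda.current_device(),           # an index, not the bare "cuda" device
        sync_module_states=True,                         # needs the module on GPU for comms
    )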
2025-12-04T13:21:31.5164181Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5164548Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5164594Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5164891Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5164953Z {} 2025-12-04T13:21:31.5165055Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5165136Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5165624Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5165683Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5165827Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5165990Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5166279Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5166437Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5166726Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5166852Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5167131Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5167280Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5167575Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5167723Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5167999Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5168136Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5168460Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5168611Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5169100Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5169232Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5169427Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5169789Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5169903Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5170117Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5170282Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5170320Z dist init r=3, world=4 2025-12-04T13:21:31.5170459Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5170619Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5170907Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5171062Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5171349Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5171474Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5171750Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5171912Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5172187Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5172336Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5172610Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5172748Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5173026Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5173184Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5173662Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 
2025-12-04T13:21:31.5173787Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5173985Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5174330Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5174446Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5174658Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5174823Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5174862Z dist init r=1, world=4 2025-12-04T13:21:31.5175001Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5175161Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5175447Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5175602Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5175887Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5176014Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5176301Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5176450Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5176725Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5176872Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5177147Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5177292Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5177570Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5177728Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5178245Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5178373Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5178571Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5178918Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5179031Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5179244Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5179409Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5179449Z dist init r=0, world=4 2025-12-04T13:21:31.5179586Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5179746Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5180033Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5180187Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5180484Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5180610Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5180887Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5181033Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.5181311Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5181460Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5181746Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5181895Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5182171Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5182337Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5182804Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:21:31.5182921Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5183116Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5183459Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5183575Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5183786Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5183951Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5183989Z dist init r=2, world=4 2025-12-04T13:21:31.5184027Z FAILED [6.9142s] [ 50%] 2025-12-04T13:21:31.5184029Z 2025-12-04T13:21:31.5184086Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5184183Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:21:31.5184229Z Traceback (most recent call last): 2025-12-04T13:21:31.5184392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5184458Z self._join_processes(fn) 2025-12-04T13:21:31.5184631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in 
_join_processes 2025-12-04T13:21:31.5184686Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5184865Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5184909Z raise RuntimeError(error) 2025-12-04T13:21:31.5184989Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5185034Z Traceback (most recent call last): 2025-12-04T13:21:31.5185197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5185240Z getattr(self, test_name)() 2025-12-04T13:21:31.5185399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5185434Z fn() 2025-12-04T13:21:31.5185595Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5185636Z method(*args, **kwargs) 2025-12-04T13:21:31.5185798Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5185838Z method(*args, **kwargs) 2025-12-04T13:21:31.5185988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5186026Z with policy(): 2025-12-04T13:21:31.5186187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5186230Z raise RuntimeError(msg) 2025-12-04T13:21:31.5186574Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5186578Z 2025-12-04T13:21:31.5186653Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5186873Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5186875Z 2025-12-04T13:21:31.5186964Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5186966Z 2025-12-04T13:21:31.5186968Z 2025-12-04T13:21:31.5187045Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5187132Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5187368Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aedcf6ff3ae4698.xml - 2025-12-04T13:21:31.5187429Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5187668Z FAILED [6.9142s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5187715Z Traceback (most recent call last): 2025-12-04T13:21:31.5187880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5187922Z getattr(self, test_name)() 2025-12-04T13:21:31.5188083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5188118Z fn() 2025-12-04T13:21:31.5188320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5188361Z method(*args, **kwargs) 2025-12-04T13:21:31.5188513Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5188553Z method(*args, **kwargs) 2025-12-04T13:21:31.5188703Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5188740Z with policy(): 2025-12-04T13:21:31.5188891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5188932Z raise RuntimeError(msg) 2025-12-04T13:21:31.5189275Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5189277Z 2025-12-04T13:21:31.5189352Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5189583Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5189597Z 2025-12-04T13:21:31.5189685Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5189748Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5189810Z ======================= 1 failed, 17 deselected in 7.06s ======================= 2025-12-04T13:21:31.5189860Z Got exit code 1 2025-12-04T13:21:31.5189900Z Retrying single test... 
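The failure above comes from the test harness's CUDA memory-leak check: it snapshots caching-allocator and driver-level memory before the test body and compares them afterwards. On device 0 the caching allocator grew from 512 to 22528 bytes and the driver-reported allocation grew from 2453667840 to 3177185280 bytes (exactly 690 MiB more), so the check raised and the process exited with code 10. The sketch below only illustrates that before/after comparison; it is a hypothetical stand-in built on public torch.cuda calls, not PyTorch's actual leak checker, and run_with_leak_check / test_fn are illustrative names.

    # Minimal sketch (assumption: NOT PyTorch's real leak checker) of the
    # before/after comparison described in the failure message above.
    import torch

    def run_with_leak_check(test_fn, device=0):
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)    # caching-allocator view
        free_before, total = torch.cuda.mem_get_info(device)  # driver view (free, total)
        driver_before = total - free_before

        test_fn()

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        # Flag a leak only if both views grew, mirroring the wording of the log.
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )

The repro line the harness prints (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda) re-runs only this test with the same leak check enabled.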
2025-12-04T13:21:31.5190089Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4348f9a325949f23.xml 2025-12-04T13:21:31.5190147Z ============================= test session starts ============================== 2025-12-04T13:21:31.5190261Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5190302Z cachedir: .pytest_cache 2025-12-04T13:21:31.5190461Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5190507Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5190547Z configfile: pytest.ini 2025-12-04T13:21:31.5190709Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5190785Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5191000Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5191044Z Running 1 items in this shard 2025-12-04T13:21:31.5191047Z 2025-12-04T13:21:31.5191341Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:20:40.948000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 564509 2025-12-04T13:21:31.5191497Z I1204 13:20:40.949000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 564510 2025-12-04T13:21:31.5191647Z I1204 13:20:40.950000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 564511 2025-12-04T13:21:31.5191798Z I1204 13:20:40.950000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 564512 2025-12-04T13:21:31.5192175Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5192224Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5192516Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5192581Z {} 2025-12-04T13:21:31.5192687Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5192761Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5193253Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5193315Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5193680Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5193740Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5194090Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5194148Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5194438Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5194503Z {} 2025-12-04T13:21:31.5194606Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5194680Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5194966Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5195028Z {} 2025-12-04T13:21:31.5195132Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5195205Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5195697Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5195757Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5196241Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5196299Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5196662Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5196709Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5197000Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5197062Z {} 2025-12-04T13:21:31.5197163Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5197236Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5197732Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5197792Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5197951Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5198115Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5198441Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5198614Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5198903Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5199028Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5199308Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5199458Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5199735Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5199884Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5200161Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5200299Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5200576Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5200738Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5201206Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5201324Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5201521Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5201871Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5202001Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5202212Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5202399Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5202437Z dist init r=0, world=4 2025-12-04T13:21:31.5202576Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5202748Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5203035Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5203190Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5203475Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5203600Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5203877Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5204026Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5204302Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5204450Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5204723Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5204861Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5205150Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5205299Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5205765Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 
2025-12-04T13:21:31.5205880Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5206078Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5206437Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5206564Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5206777Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5206950Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5206989Z dist init r=3, world=4 2025-12-04T13:21:31.5207128Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5207289Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5207574Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5207730Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5208013Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5208140Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5208454Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5208604Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5208879Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5209026Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5209312Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5209449Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5209726Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5209877Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5210340Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:21:31.5210456Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5210664Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5211026Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5211139Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5211362Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5211527Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5211566Z dist init r=1, world=4 2025-12-04T13:21:31.5211704Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5211864Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5212150Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5212304Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5212589Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5212713Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5212992Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5213141Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.5213418Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5213576Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5213852Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5213989Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5214265Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5214415Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5214900Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:21:31.5215023Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5215218Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5215568Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5215698Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5215909Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5216074Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5216113Z dist init r=2, world=4 2025-12-04T13:21:31.5216151Z FAILED [7.4173s] [100%] 2025-12-04T13:21:31.5216156Z 2025-12-04T13:21:31.5216212Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5216309Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:21:31.5216356Z Traceback (most recent call last): 2025-12-04T13:21:31.5216519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5216563Z self._join_processes(fn) 2025-12-04T13:21:31.5216736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in 
_join_processes 2025-12-04T13:21:31.5216791Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5216969Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5217012Z raise RuntimeError(error) 2025-12-04T13:21:31.5217092Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5217137Z Traceback (most recent call last): 2025-12-04T13:21:31.5217298Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5217341Z getattr(self, test_name)() 2025-12-04T13:21:31.5217509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5217544Z fn() 2025-12-04T13:21:31.5217698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5217740Z method(*args, **kwargs) 2025-12-04T13:21:31.5217892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5217933Z method(*args, **kwargs) 2025-12-04T13:21:31.5218083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5218120Z with policy(): 2025-12-04T13:21:31.5218307Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5218347Z raise RuntimeError(msg) 2025-12-04T13:21:31.5218702Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:21:31.5218704Z 2025-12-04T13:21:31.5218791Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5219010Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5219013Z 2025-12-04T13:21:31.5219101Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5219115Z 2025-12-04T13:21:31.5219117Z 2025-12-04T13:21:31.5219195Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5219283Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5219516Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4348f9a325949f23.xml - 2025-12-04T13:21:31.5219577Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5219812Z FAILED [7.4173s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5219859Z Traceback (most recent call last): 2025-12-04T13:21:31.5220022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5220065Z getattr(self, test_name)() 2025-12-04T13:21:31.5220225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5220260Z fn() 2025-12-04T13:21:31.5220412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5220452Z method(*args, **kwargs) 2025-12-04T13:21:31.5220603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5220643Z method(*args, **kwargs) 2025-12-04T13:21:31.5220792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5220829Z with policy(): 2025-12-04T13:21:31.5220980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5221022Z raise RuntimeError(msg) 2025-12-04T13:21:31.5221375Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:21:31.5221378Z 2025-12-04T13:21:31.5221452Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5221673Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5221676Z 2025-12-04T13:21:31.5221763Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5221826Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5221886Z ======================= 1 failed, 18 deselected in 7.56s ======================= 2025-12-04T13:21:31.5221924Z Got exit code 1 2025-12-04T13:21:31.5221964Z Retrying single test... 
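Each retry also repeats the FSDP UserWarning about `device_id` cuda having no explicit index, and the warning itself names the remedy: either call torch.cuda.set_device() before FSDP initialization or pass an explicit device index as `device_id`. A minimal sketch of both options follows; names such as rank and model are illustrative and not taken from this test, and this assumes an ordinary one-GPU-per-rank torch.distributed setup.

    # Sketch of the fix the FSDP warning suggests (illustrative only).
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap(model, rank):
        torch.cuda.set_device(rank)                               # option 1: set the current device first
        return FSDP(model, device_id=torch.device("cuda", rank))  # option 2: explicit device index

Silencing that warning is independent of the leak check itself; in this log the leak check fails on all four ranks in every retry regardless.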
2025-12-04T13:21:31.5222152Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-140229431b9f8263.xml 2025-12-04T13:21:31.5222209Z ============================= test session starts ============================== 2025-12-04T13:21:31.5222331Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5222372Z cachedir: .pytest_cache 2025-12-04T13:21:31.5222541Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5222587Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5222627Z configfile: pytest.ini 2025-12-04T13:21:31.5222788Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5222875Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5223087Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5223132Z Running 1 items in this shard 2025-12-04T13:21:31.5223134Z 2025-12-04T13:21:31.5223430Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:20:50.957000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 564903 2025-12-04T13:21:31.5223585Z I1204 13:20:50.958000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 564904 2025-12-04T13:21:31.5223737Z I1204 13:20:50.958000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 564905 2025-12-04T13:21:31.5223888Z I1204 13:20:50.959000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 564906 2025-12-04T13:21:31.5224249Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5224297Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5224589Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5224655Z {} 2025-12-04T13:21:31.5224761Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5224835Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5225342Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5225405Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5225759Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5225807Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5226093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5226157Z {} 2025-12-04T13:21:31.5226260Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5226335Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5226832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5226904Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5227261Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5227317Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5227604Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5227667Z {} 2025-12-04T13:21:31.5227770Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5227842Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5228376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5228437Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5228791Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5228839Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5229125Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5229187Z {} 2025-12-04T13:21:31.5229288Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5229362Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5229878Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5229939Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5230083Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5230247Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5230536Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5230692Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5230989Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5231127Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5231405Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5231566Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5231848Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5231997Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5232273Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5232411Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5232688Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5232838Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5233305Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5233422Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5233619Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5233976Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5234092Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5234306Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5234475Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5234513Z dist init r=0, world=4 2025-12-04T13:21:31.5234652Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5234812Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5235114Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5235268Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5235564Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5235690Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5235975Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5236124Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5236403Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5236554Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5236831Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5236968Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5237249Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5237398Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5237865Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 
2025-12-04T13:21:31.5237982Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5238234Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5238581Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5238695Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5238908Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5239073Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5239113Z dist init r=1, world=4 2025-12-04T13:21:31.5239251Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5239424Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5239710Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5239876Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5240161Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5240299Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5240576Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5240724Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5240999Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5241148Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5241426Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5241563Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5241839Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5241988Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5242463Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:21:31.5242579Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5242774Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5243120Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5243275Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5243507Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5243698Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5246097Z dist init r=2, world=4 2025-12-04T13:21:31.5246253Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5246425Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5246714Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5246878Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5247164Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5247287Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5247565Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5247711Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.5247992Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5248248Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5248527Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5248663Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5248941Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5249090Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5249580Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5249697Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5249892Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5250238Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5250352Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5250575Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5250754Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5250792Z dist init r=3, world=4 2025-12-04T13:21:31.5250831Z FAILED [6.9131s] [100%] 2025-12-04T13:21:31.5250834Z 2025-12-04T13:21:31.5250892Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5251005Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:21:31.5251052Z Traceback (most recent call last): 2025-12-04T13:21:31.5251216Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5251260Z self._join_processes(fn) 2025-12-04T13:21:31.5251433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in 
_join_processes 2025-12-04T13:21:31.5251489Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5251665Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5251709Z raise RuntimeError(error) 2025-12-04T13:21:31.5251790Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5251835Z Traceback (most recent call last): 2025-12-04T13:21:31.5251998Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5252040Z getattr(self, test_name)() 2025-12-04T13:21:31.5252198Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5252233Z fn() 2025-12-04T13:21:31.5252385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5252427Z method(*args, **kwargs) 2025-12-04T13:21:31.5252577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5252616Z method(*args, **kwargs) 2025-12-04T13:21:31.5252765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5252803Z with policy(): 2025-12-04T13:21:31.5252956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5252997Z raise RuntimeError(msg) 2025-12-04T13:21:31.5253349Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5253352Z 2025-12-04T13:21:31.5253428Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5253649Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5253651Z 2025-12-04T13:21:31.5253739Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5253743Z 2025-12-04T13:21:31.5253744Z 2025-12-04T13:21:31.5253822Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5253909Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5254143Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-140229431b9f8263.xml - 2025-12-04T13:21:31.5254213Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5254463Z FAILED [6.9131s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5254508Z Traceback (most recent call last): 2025-12-04T13:21:31.5254673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5254726Z getattr(self, test_name)() 2025-12-04T13:21:31.5254884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5254919Z fn() 2025-12-04T13:21:31.5255072Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5255112Z method(*args, **kwargs) 2025-12-04T13:21:31.5255266Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5255307Z method(*args, **kwargs) 2025-12-04T13:21:31.5255458Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5255497Z with policy(): 2025-12-04T13:21:31.5255648Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5255690Z raise RuntimeError(msg) 2025-12-04T13:21:31.5256032Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5256034Z 2025-12-04T13:21:31.5256110Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5256328Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5256332Z 2025-12-04T13:21:31.5256419Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5256482Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.5256543Z ======================= 1 failed, 18 deselected in 7.05s ======================= 2025-12-04T13:21:31.5256582Z Got exit code 1 2025-12-04T13:21:31.5256749Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5256891Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5257080Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a6879dc84d5f9c6e.xml 2025-12-04T13:21:31.5257139Z ============================= test session starts ============================== 2025-12-04T13:21:31.5257252Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5257295Z cachedir: .pytest_cache 2025-12-04T13:21:31.5257454Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5257500Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5257540Z configfile: pytest.ini 2025-12-04T13:21:31.5257705Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5257780Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5257833Z stepcurrent: skipping 18 already run items. 2025-12-04T13:21:31.5257876Z Running 1 items in this shard 2025-12-04T13:21:31.5257878Z 2025-12-04T13:21:31.5258257Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 13:21:00.407000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 565297 2025-12-04T13:21:31.5258424Z I1204 13:21:00.408000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 565298 2025-12-04T13:21:31.5258578Z I1204 13:21:00.408000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 565299 2025-12-04T13:21:31.5258750Z I1204 13:21:00.409000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 565300 2025-12-04T13:21:31.5259113Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5259164Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5259659Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5259724Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5260080Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5260129Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5260617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5260679Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5261032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5261078Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5261581Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5261641Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5261993Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5262040Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5262542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5262610Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5262753Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5262917Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5263207Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5263373Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5263661Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5263786Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5264066Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5264214Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5264493Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5264641Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5264918Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5265055Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5265331Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5265490Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5265972Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5266090Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5266284Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5266646Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5266771Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5266983Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5267159Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5267197Z dist init r=1, world=4 2025-12-04T13:21:31.5267335Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5267503Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5267792Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5267946Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5268280Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5268405Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5268682Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5268830Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5269107Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5269257Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5269534Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5269672Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5269964Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5270112Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5270593Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T13:21:31.5270708Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5270905Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5271274Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5271401Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5271614Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5271792Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5271832Z dist init r=2, world=4 2025-12-04T13:21:31.5271969Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5272131Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5272416Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5272571Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5272856Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5272987Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5273264Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5273411Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5273686Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5273834Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5274122Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5274258Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5274536Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5274684Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5275162Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5275286Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5275480Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5275853Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5275976Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5276189Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5276356Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5276394Z dist init r=3, world=4 2025-12-04T13:21:31.5276533Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5276691Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5276978Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5277132Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5277417Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5277541Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5277818Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5277965Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5278286Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5278434Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5278710Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5278848Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5279124Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5279272Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5279763Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T13:21:31.5279888Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5280083Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5280455Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5280568Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5280779Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5280944Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5280981Z dist init r=0, world=4 2025-12-04T13:21:31.5281019Z FAILED [7.5117s] [100%] 2025-12-04T13:21:31.5281021Z 2025-12-04T13:21:31.5281079Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5281178Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___ 2025-12-04T13:21:31.5281224Z Traceback (most recent call last): 2025-12-04T13:21:31.5281385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5281429Z self._join_processes(fn) 2025-12-04T13:21:31.5281602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5281657Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5281833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5281877Z raise RuntimeError(error) 2025-12-04T13:21:31.5281958Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5282005Z Traceback (most recent call last): 2025-12-04T13:21:31.5282165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5282208Z getattr(self, test_name)() 2025-12-04T13:21:31.5282374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5282409Z fn() 2025-12-04T13:21:31.5282561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5282602Z method(*args, **kwargs) 2025-12-04T13:21:31.5282752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5282791Z method(*args, **kwargs) 2025-12-04T13:21:31.5282942Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5282978Z with policy(): 2025-12-04T13:21:31.5283130Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T13:21:31.5283170Z raise RuntimeError(msg) 2025-12-04T13:21:31.5283534Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 2025-12-04T13:21:31.5283546Z 2025-12-04T13:21:31.5283622Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5283854Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5283856Z 2025-12-04T13:21:31.5283955Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5283958Z 2025-12-04T13:21:31.5283959Z 2025-12-04T13:21:31.5284035Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5284123Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5284359Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a6879dc84d5f9c6e.xml - 2025-12-04T13:21:31.5284418Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5284667Z FAILED [7.5117s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5284713Z Traceback (most recent call last): 2025-12-04T13:21:31.5284875Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5284919Z getattr(self, test_name)() 2025-12-04T13:21:31.5285078Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5285113Z fn() 2025-12-04T13:21:31.5285263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5285304Z method(*args, **kwargs) 2025-12-04T13:21:31.5285454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5285495Z method(*args, **kwargs) 2025-12-04T13:21:31.5285644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5285681Z with policy(): 2025-12-04T13:21:31.5285831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5285872Z raise RuntimeError(msg) 2025-12-04T13:21:31.5286233Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5286236Z 2025-12-04T13:21:31.5286312Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5286543Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5286545Z 2025-12-04T13:21:31.5286632Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5286695Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5286756Z ======================= 1 failed, 18 deselected in 7.65s ======================= 2025-12-04T13:21:31.5286792Z Got exit code 1 2025-12-04T13:21:31.5286832Z Retrying single test... 2025-12-04T13:21:31.5287022Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2f6c6c23e79f8289.xml 2025-12-04T13:21:31.5287089Z ============================= test session starts ============================== 2025-12-04T13:21:31.5287202Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5287254Z cachedir: .pytest_cache 2025-12-04T13:21:31.5287413Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5287459Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5287498Z configfile: pytest.ini 2025-12-04T13:21:31.5287662Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5287747Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5287972Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5288017Z Running 1 items in this shard 2025-12-04T13:21:31.5288019Z 2025-12-04T13:21:31.5288355Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 13:21:10.483000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 565675 2025-12-04T13:21:31.5288510Z I1204 13:21:10.484000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 565676 2025-12-04T13:21:31.5288662Z I1204 13:21:10.484000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 565677 2025-12-04T13:21:31.5288813Z I1204 13:21:10.485000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 565678 2025-12-04T13:21:31.5289175Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5289223Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5289717Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5289780Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5290153Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5290201Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5290689Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5290750Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5291101Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5291148Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5291519Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5291576Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5292062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5292133Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5292692Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5292751Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5292896Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5293058Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5293348Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5293504Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5293794Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5293920Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5294197Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5294346Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5294634Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5294781Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5295061Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5295197Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5295475Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5295624Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5296115Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 
2025-12-04T13:21:31.5296243Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5296437Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5296808Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5296923Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5297134Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5297299Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5297339Z dist init r=2, world=4 2025-12-04T13:21:31.5297477Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5297638Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5297926Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5298079Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5298430Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5298555Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5298831Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5298993Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5299269Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5299418Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5299695Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5299832Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5300121Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5300271Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5300762Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5300890Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5301086Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5301445Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5301560Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5301770Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5301936Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5301976Z dist init r=3, world=4 2025-12-04T13:21:31.5302114Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5302273Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5302559Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5302712Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5303045Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5303184Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5303461Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5303609Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5303884Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5304031Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5304309Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5304456Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5304742Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5304890Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5305378Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T13:21:31.5305494Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5305687Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5306045Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5306159Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5306370Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5306536Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5306574Z dist init r=0, world=4 2025-12-04T13:21:31.5306714Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5306873Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5307162Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5307315Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5307614Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5307739Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5308014Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5308199Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5308476Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5308635Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5308911Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5309061Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5309337Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5309499Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5309975Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5310090Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5310284Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5310640Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5310754Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5310965Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5311130Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5311168Z dist init r=1, world=4 2025-12-04T13:21:31.5311206Z FAILED [7.5141s] [100%] 2025-12-04T13:21:31.5311208Z 2025-12-04T13:21:31.5311266Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5311366Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___ 2025-12-04T13:21:31.5311412Z Traceback (most recent call last): 2025-12-04T13:21:31.5311585Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5311629Z self._join_processes(fn) 2025-12-04T13:21:31.5311802Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5311857Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5312033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5312077Z raise RuntimeError(error) 2025-12-04T13:21:31.5312157Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5312203Z Traceback (most recent call last): 2025-12-04T13:21:31.5312364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5312407Z getattr(self, test_name)() 2025-12-04T13:21:31.5312565Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5312600Z fn() 2025-12-04T13:21:31.5312763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5312813Z method(*args, **kwargs) 2025-12-04T13:21:31.5312963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5313003Z method(*args, **kwargs) 2025-12-04T13:21:31.5313153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5313201Z with policy(): 2025-12-04T13:21:31.5313353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T13:21:31.5313393Z raise RuntimeError(msg) 2025-12-04T13:21:31.5313747Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T13:21:31.5313750Z 2025-12-04T13:21:31.5313825Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5314057Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5314059Z 2025-12-04T13:21:31.5314147Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5314150Z 2025-12-04T13:21:31.5314209Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5314254Z Traceback (most recent call last): 2025-12-04T13:21:31.5314418Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5314459Z getattr(self, test_name)() 2025-12-04T13:21:31.5314618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5314652Z fn() 2025-12-04T13:21:31.5314803Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5314842Z method(*args, **kwargs) 2025-12-04T13:21:31.5314993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5315034Z method(*args, **kwargs) 2025-12-04T13:21:31.5315183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5315221Z with policy(): 2025-12-04T13:21:31.5315381Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5315422Z raise RuntimeError(msg) 2025-12-04T13:21:31.5315776Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5315779Z 2025-12-04T13:21:31.5315852Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5316082Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5316085Z 2025-12-04T13:21:31.5316173Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5316175Z 2025-12-04T13:21:31.5316177Z 2025-12-04T13:21:31.5316253Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5316351Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5316585Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2f6c6c23e79f8289.xml - 2025-12-04T13:21:31.5316654Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5316903Z FAILED [7.5141s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5316959Z Traceback (most recent call last): 2025-12-04T13:21:31.5317122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5317163Z getattr(self, test_name)() 2025-12-04T13:21:31.5317323Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5317356Z fn() 2025-12-04T13:21:31.5317509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5317549Z method(*args, **kwargs) 2025-12-04T13:21:31.5317699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5317737Z method(*args, **kwargs) 2025-12-04T13:21:31.5317887Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5317924Z with policy(): 2025-12-04T13:21:31.5318076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5318116Z raise RuntimeError(msg) 2025-12-04T13:21:31.5318514Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 
2025-12-04T13:21:31.5318517Z 2025-12-04T13:21:31.5318591Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5318820Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5318823Z 2025-12-04T13:21:31.5318911Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5318913Z 2025-12-04T13:21:31.5318970Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5319016Z Traceback (most recent call last): 2025-12-04T13:21:31.5319193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5319235Z getattr(self, test_name)() 2025-12-04T13:21:31.5319395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5319429Z fn() 2025-12-04T13:21:31.5319579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5319619Z method(*args, **kwargs) 2025-12-04T13:21:31.5319768Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5319808Z method(*args, **kwargs) 2025-12-04T13:21:31.5319957Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5319994Z with policy(): 2025-12-04T13:21:31.5320146Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5320187Z raise RuntimeError(msg) 2025-12-04T13:21:31.5320554Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5320568Z 2025-12-04T13:21:31.5320640Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5320868Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5320893Z 2025-12-04T13:21:31.5320979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5321044Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5321105Z ======================= 1 failed, 18 deselected in 7.65s ======================= 2025-12-04T13:21:31.5321144Z Got exit code 1 2025-12-04T13:21:31.5321183Z Retrying single test... 
2025-12-04T13:21:31.5321374Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4d126ec424ab47b8.xml 2025-12-04T13:21:31.5321431Z ============================= test session starts ============================== 2025-12-04T13:21:31.5321543Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5321583Z cachedir: .pytest_cache 2025-12-04T13:21:31.5321743Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5321789Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5321829Z configfile: pytest.ini 2025-12-04T13:21:31.5321992Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5322067Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5322291Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5322335Z Running 1 items in this shard 2025-12-04T13:21:31.5322337Z 2025-12-04T13:21:31.5322646Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 13:21:20.916000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 566053 2025-12-04T13:21:31.5322802Z I1204 13:21:20.917000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 566054 2025-12-04T13:21:31.5322964Z I1204 13:21:20.917000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 566055 2025-12-04T13:21:31.5323115Z I1204 13:21:20.918000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 566056 2025-12-04T13:21:31.5323474Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5323523Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5324016Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5324080Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5324442Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5324500Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5324989Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5325061Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5325415Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5325461Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5325948Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5326007Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5326359Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5326404Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5326891Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5326951Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5327095Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5327257Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5327559Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5327715Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5328000Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5328125Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5328447Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5328596Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5328889Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5329048Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5329322Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5329471Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5329751Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5329901Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5330386Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 
2025-12-04T13:21:31.5330503Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5330699Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5331061Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5331176Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5331388Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5331554Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5331591Z dist init r=3, world=4 2025-12-04T13:21:31.5331743Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5331902Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5332192Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5332345Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5332630Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5332755Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5333041Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5333198Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5333473Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5333633Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5333908Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5334046Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5334322Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5334473Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5334953Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T13:21:31.5335068Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5335265Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5335621Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5335736Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5335966Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5336133Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5336170Z dist init r=2, world=4 2025-12-04T13:21:31.5336309Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5336469Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5336756Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5336912Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5337207Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5337349Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5337623Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5337771Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5338059Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5338247Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5338522Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5338658Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5338935Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5339086Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5339567Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T13:21:31.5339682Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5339877Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5340247Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5340360Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5340572Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5340735Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5340774Z dist init r=0, world=4 2025-12-04T13:21:31.5340910Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5341071Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5341358Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5341523Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5341822Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5341944Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5342234Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5342382Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5342659Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5342806Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5343080Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5343217Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5343494Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5343643Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5344121Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5344236Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5344441Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5344797Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5344911Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5345122Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5345288Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5345325Z dist init r=1, world=4 2025-12-04T13:21:31.5345364Z FAILED [7.4123s] [100%] 2025-12-04T13:21:31.5345367Z 2025-12-04T13:21:31.5345424Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5345533Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___ 2025-12-04T13:21:31.5345579Z Traceback (most recent call last): 2025-12-04T13:21:31.5345751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5345796Z self._join_processes(fn) 2025-12-04T13:21:31.5345967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5346021Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5346209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5346253Z raise RuntimeError(error) 2025-12-04T13:21:31.5346333Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5346379Z Traceback (most recent call last): 2025-12-04T13:21:31.5346540Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5346584Z getattr(self, test_name)() 2025-12-04T13:21:31.5346742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5346776Z fn() 2025-12-04T13:21:31.5346926Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5346966Z method(*args, **kwargs) 2025-12-04T13:21:31.5347119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5347158Z method(*args, **kwargs) 2025-12-04T13:21:31.5347309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5347346Z with policy(): 2025-12-04T13:21:31.5347498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T13:21:31.5347540Z raise RuntimeError(msg) 2025-12-04T13:21:31.5347891Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5347895Z 2025-12-04T13:21:31.5347970Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5348239Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5348241Z 2025-12-04T13:21:31.5348342Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5348345Z 2025-12-04T13:21:31.5348347Z 2025-12-04T13:21:31.5348424Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5348512Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5348746Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4d126ec424ab47b8.xml - 2025-12-04T13:21:31.5348805Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5349053Z FAILED [7.4123s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5349100Z Traceback (most recent call last): 2025-12-04T13:21:31.5349265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5349306Z getattr(self, test_name)() 2025-12-04T13:21:31.5349478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5349524Z fn() 2025-12-04T13:21:31.5349675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5349715Z method(*args, **kwargs) 2025-12-04T13:21:31.5349866Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5349917Z method(*args, **kwargs) 2025-12-04T13:21:31.5350066Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5350103Z with policy(): 2025-12-04T13:21:31.5350254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5350295Z raise RuntimeError(msg) 2025-12-04T13:21:31.5350651Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 
2025-12-04T13:21:31.5350654Z 2025-12-04T13:21:31.5350729Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5350960Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5350963Z 2025-12-04T13:21:31.5351050Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5351113Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5351175Z ======================= 1 failed, 18 deselected in 7.55s ======================= 2025-12-04T13:21:31.5351213Z Got exit code 1 2025-12-04T13:21:31.5351391Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5351521Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5351709Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-83a646cba36145fb.xml 2025-12-04T13:21:31.5351767Z ============================= test session starts ============================== 2025-12-04T13:21:31.5351879Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5351920Z cachedir: .pytest_cache 2025-12-04T13:21:31.5352086Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5352133Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5352174Z configfile: pytest.ini 2025-12-04T13:21:31.5352337Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5352412Z collecting ... collected 60 items / 19 deselected / 41 selected 2025-12-04T13:21:31.5352465Z stepcurrent: skipping 19 already run items. 
2025-12-04T13:21:31.5352509Z Running 0 items in this shard 2025-12-04T13:21:31.5352511Z 2025-12-04T13:21:31.5352743Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-83a646cba36145fb.xml - 2025-12-04T13:21:31.5352802Z ============================ 19 deselected in 0.01s ============================ 2025-12-04T13:21:31.5356005Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda'] 2025-12-04T13:21:31.5356030Z 2025-12-04T13:21:31.5356216Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 1/3 (test/test-reports/distributed.fsdp.test_fsdp_core_1.3_b5bdac945a318f3b_.log) 2025-12-04T13:21:31.5356218Z 2025-12-04T13:21:31.5356341Z Finished distributed/fsdp/test_fsdp_core 1/3 ... 
[2025-12-04 13:21:31.306051][2293989.955230869], took 23.36min 2025-12-04T13:21:31.5356604Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:21:31.5356690Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:21:31.5356795Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:21:31.5356843Z Uploading artifacts took 0.00 seconds 2025-12-04T13:21:31.5356897Z distributed/fsdp/test_fsdp_core 1/3 failed! 2025-12-04T13:21:31.5357006Z Running distributed/test_c10d_spawn_gloo 1/1 ... [2025-12-04 13:21:31.310061][2293989.959244515] 2025-12-04T13:21:31.5357055Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:21:31.5357386Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_spawn_gloo.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:21:31.310253] 2025-12-04T13:22:30.7624246Z 2025-12-04T13:22:30.7625268Z distributed/test_c10d_spawn_gloo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_gloo_1.1_16b0e09937d5ce50_.log 2025-12-04T13:22:30.7629877Z Running 11 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cpu, test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cuda, test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_rnn, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_gather, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all_single, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_allreduce, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_broadcast, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_gather, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_reduce, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_scatter 2025-12-04T13:22:30.7631760Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cpu 2025-12-04T13:22:30.7632123Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cuda 2025-12-04T13:22:30.7632479Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_rnn 2025-12-04T13:22:30.7632822Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_gather 2025-12-04T13:22:30.7633152Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all 2025-12-04T13:22:30.7633493Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all_single 2025-12-04T13:22:30.7633833Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_allreduce 2025-12-04T13:22:30.7634163Z 
Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_broadcast 2025-12-04T13:22:30.7634486Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_gather 2025-12-04T13:22:30.7634804Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_reduce 2025-12-04T13:22:30.7635123Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_scatter 2025-12-04T13:22:30.7635302Z 2025-12-04T13:22:30.7635425Z Finished distributed/test_c10d_spawn_gloo 1/1 ... [2025-12-04 13:22:30.762114][2294049.4112939], took 0.99min 2025-12-04T13:22:30.7639412Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:22:30.7655447Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:22:30.7658932Z Running distributed/test_c10d_spawn_ucc 1/1 ... [2025-12-04 13:22:30.765755][2294049.414937132] 2025-12-04T13:22:30.7659132Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:22:30.7660622Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_spawn_ucc.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:22:30.765958] 2025-12-04T13:22:44.7105997Z 2025-12-04T13:22:44.7106850Z distributed/test_c10d_spawn_ucc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_ucc_1.1_2d2963b015177e4d_.log 2025-12-04T13:22:44.7109248Z Running 6 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2025-12-04T13:22:44.7111001Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather 2025-12-04T13:22:44.7111575Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all 2025-12-04T13:22:44.7112216Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single 2025-12-04T13:22:44.7112826Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce 2025-12-04T13:22:44.7113388Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast 2025-12-04T13:22:44.7113939Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2025-12-04T13:22:44.7114245Z 2025-12-04T13:22:44.7114466Z Finished distributed/test_c10d_spawn_ucc 1/1 ... 
[2025-12-04 13:22:44.710382][2294063.359561743], took 0.23min 2025-12-04T13:22:44.7121927Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:22:44.7137968Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:22:44.7141345Z Running distributed/test_c10d_gloo 1/2 ... [2025-12-04 13:22:44.713984][2294063.363167496] 2025-12-04T13:22:44.7141589Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:22:44.7142852Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_gloo.py', '--shard-id=1', '--num-shards=2', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:22:44.714164] 2025-12-04T13:32:54.3360990Z 2025-12-04T13:32:54.3362303Z distributed/test_c10d_gloo 1/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_gloo_1.2_a9670515dc54cf51_.log 2025-12-04T13:32:54.3387479Z Running 127 items in this shard: test/distributed/test_c10d_gloo.py::RendezvousTCPTest::test_tcp_init, test/distributed/test_c10d_gloo.py::TimeoutTest::test_default_store_timeout_gloo, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_into_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_barrier_implies_wait, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_empty_tensors, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_multi_device_constructor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_cpu, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_gloo, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_sparse_gradients, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_sharded_tensor, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input, test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward, test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_single_bucket, test/distributed/test_c10d_gloo.py::ReducerTest::test_single_dtype_single_bucket, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_inference_mode, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_into_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics_cuda, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_op_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_block_current_stream_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_empty_tensors, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_inference_mode, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_op_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_block_current_stream_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_complex, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_json, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_checks, test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cpu, test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cuda, test/distributed/test_c10d_gloo.py::CommTest::test_gloo_rank_membership, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_default_pg_gloo, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_gloo_new_group, test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_complex, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allgather_coalesced, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends, test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_duplicate_pg 2025-12-04T13:32:54.3403778Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::RendezvousTCPTest::test_tcp_init 2025-12-04T13:32:54.3404074Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::TimeoutTest::test_default_store_timeout_gloo 2025-12-04T13:32:54.3404392Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_async 2025-12-04T13:32:54.3404720Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_checks 2025-12-04T13:32:54.3405049Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_into_tensor_coalesced 2025-12-04T13:32:54.3405372Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress 2025-12-04T13:32:54.3405675Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress_cuda 2025-12-04T13:32:54.3405977Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics 2025-12-04T13:32:54.3406367Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics_cuda 2025-12-04T13:32:54.3406667Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_checks 2025-12-04T13:32:54.3406972Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_basics 2025-12-04T13:32:54.3407340Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks_cuda 2025-12-04T13:32:54.3407666Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_overall_timeout 2025-12-04T13:32:54.3407968Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress 2025-12-04T13:32:54.3408302Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_barrier_implies_wait 2025-12-04T13:32:54.3408603Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics 2025-12-04T13:32:54.3408923Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics_cuda 2025-12-04T13:32:54.3409240Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress 2025-12-04T13:32:54.3409538Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress_cuda 2025-12-04T13:32:54.3409836Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_empty_tensors 2025-12-04T13:32:54.3410147Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress_cuda 2025-12-04T13:32:54.3410453Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_multi_device_constructor 2025-12-04T13:32:54.3410758Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics 2025-12-04T13:32:54.3411046Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress 2025-12-04T13:32:54.3411339Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress_cuda 2025-12-04T13:32:54.3411640Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics 2025-12-04T13:32:54.3411933Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics_cuda 2025-12-04T13:32:54.3412229Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress 2025-12-04T13:32:54.3412526Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_all_to_all 2025-12-04T13:32:54.3412828Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_set_gloo_pg_timeout 2025-12-04T13:32:54.3413137Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics 2025-12-04T13:32:54.3413460Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics_cuda 2025-12-04T13:32:54.3413784Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output 
2025-12-04T13:32:54.3414124Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module 2025-12-04T13:32:54.3414503Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_weight_sharing 2025-12-04T13:32:54.3414896Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False 2025-12-04T13:32:54.3415314Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True 2025-12-04T13:32:54.3415723Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_True 2025-12-04T13:32:54.3416134Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False 2025-12-04T13:32:54.3416526Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True 2025-12-04T13:32:54.3416929Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True 2025-12-04T13:32:54.3417346Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True 2025-12-04T13:32:54.3417750Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_cpu 2025-12-04T13:32:54.3418122Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_gloo 2025-12-04T13:32:54.3418541Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_sparse_gradients 2025-12-04T13:32:54.3418904Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad 2025-12-04T13:32:54.3419302Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_grad_is_view 2025-12-04T13:32:54.3419681Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module 2025-12-04T13:32:54.3420037Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module_grad_is_view 2025-12-04T13:32:54.3420385Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output 2025-12-04T13:32:54.3420713Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_sharded_tensor 2025-12-04T13:32:54.3421037Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients 2025-12-04T13:32:54.3421372Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients_grad_is_view 2025-12-04T13:32:54.3421726Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input 2025-12-04T13:32:54.3422041Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward 2025-12-04T13:32:54.3422323Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_single_bucket 2025-12-04T13:32:54.3422617Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_single_dtype_single_bucket 2025-12-04T13:32:54.3422925Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_checks 2025-12-04T13:32:54.3423260Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_checks 2025-12-04T13:32:54.3423607Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_inference_mode 2025-12-04T13:32:54.3423960Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_into_tensor_coalesced 2025-12-04T13:32:54.3424317Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress 2025-12-04T13:32:54.3424647Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics_cuda 2025-12-04T13:32:54.3424974Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_checks 2025-12-04T13:32:54.3425310Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_async 2025-12-04T13:32:54.3425659Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks 2025-12-04T13:32:54.3426017Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks_cuda 2025-12-04T13:32:54.3426376Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_stress 2025-12-04T13:32:54.3426721Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_op_timeout 2025-12-04T13:32:54.3427079Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_overall_timeout 2025-12-04T13:32:54.3427429Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress 2025-12-04T13:32:54.3427754Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress_cuda 2025-12-04T13:32:54.3428097Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_block_current_stream_cuda 2025-12-04T13:32:54.3428474Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics 2025-12-04T13:32:54.3428793Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress 2025-12-04T13:32:54.3429119Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress_cuda 2025-12-04T13:32:54.3429446Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_empty_tensors 2025-12-04T13:32:54.3429760Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics 2025-12-04T13:32:54.3430070Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_checks 2025-12-04T13:32:54.3430402Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_noncontiguous_input 2025-12-04T13:32:54.3430741Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress_cuda 2025-12-04T13:32:54.3431072Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor 2025-12-04T13:32:54.3431425Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor_coalesced 2025-12-04T13:32:54.3431762Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress 2025-12-04T13:32:54.3432076Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics 2025-12-04T13:32:54.3432392Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_checks 2025-12-04T13:32:54.3432714Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_all_to_all 2025-12-04T13:32:54.3433046Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_set_gloo_pg_timeout 2025-12-04T13:32:54.3433397Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics 2025-12-04T13:32:54.3433720Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics 2025-12-04T13:32:54.3434027Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics_cuda 2025-12-04T13:32:54.3434345Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_async 2025-12-04T13:32:54.3434671Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_checks 2025-12-04T13:32:54.3434993Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_inference_mode 2025-12-04T13:32:54.3435303Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress 2025-12-04T13:32:54.3435610Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress_cuda 2025-12-04T13:32:54.3435942Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics_cuda 2025-12-04T13:32:54.3436264Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_stress 2025-12-04T13:32:54.3436598Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_op_timeout 2025-12-04T13:32:54.3436916Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_overall_timeout 2025-12-04T13:32:54.3437253Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress 2025-12-04T13:32:54.3437566Z Running 
1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_block_current_stream_cuda 2025-12-04T13:32:54.3437877Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress 2025-12-04T13:32:54.3438216Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress_cuda 2025-12-04T13:32:54.3438527Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics_cuda 2025-12-04T13:32:54.3438829Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_checks 2025-12-04T13:32:54.3439128Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress_cuda 2025-12-04T13:32:54.3439428Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_checks 2025-12-04T13:32:54.3439725Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter 2025-12-04T13:32:54.3440046Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor_coalesced 2025-12-04T13:32:54.3440377Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics_cuda 2025-12-04T13:32:54.3440688Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress_cuda 2025-12-04T13:32:54.3441001Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_all_to_all 2025-12-04T13:32:54.3441313Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_complex 2025-12-04T13:32:54.3441617Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_set_gloo_pg_timeout 2025-12-04T13:32:54.3441916Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_json 2025-12-04T13:32:54.3442253Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics_cuda 2025-12-04T13:32:54.3442583Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_checks 2025-12-04T13:32:54.3442893Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cpu 2025-12-04T13:32:54.3443184Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cuda 2025-12-04T13:32:54.3443467Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_gloo_rank_membership 2025-12-04T13:32:54.3443756Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_default_pg_gloo 2025-12-04T13:32:54.3444056Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_gloo_new_group 2025-12-04T13:32:54.3444344Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_complex 2025-12-04T13:32:54.3444701Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allgather_coalesced 2025-12-04T13:32:54.3445129Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends 2025-12-04T13:32:54.3445533Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_duplicate_pg 2025-12-04T13:32:54.3445714Z 2025-12-04T13:32:54.3445837Z Finished distributed/test_c10d_gloo 1/2 ... [2025-12-04 13:32:54.337459][2294672.98663913], took 10.16min 2025-12-04T13:32:54.3446277Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:32:54.3446677Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:32:54.3446925Z Running distributed/fsdp/test_fsdp_mixed_precision 1/1 ... [2025-12-04 13:32:54.341168][2294672.990352042] 2025-12-04T13:32:54.3447136Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:32:54.3447552Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_mixed_precision.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:32:54.341355] 2025-12-04T13:39:22.5545288Z 2025-12-04T13:39:22.5546515Z distributed/fsdp/test_fsdp_mixed_precision 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_mixed_precision_1.1_2515bba5fc6f1639_.log 2025-12-04T13:39:22.5571628Z Running 66 items in this shard: test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_buffer_dtype_no_root_handle, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_eval_root_cast_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval_buffers, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval_comm, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_grads_reduced_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_input_grads_with_param_mixed_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_none, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_enable_sharded_grad_scaler, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_no_reshard_after_forward, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_resnet, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_batchnorm_convert_sync_bn_False, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_batchnorm_convert_sync_bn_True, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_default, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_only_params_and_bufs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_params_and_reduce_diff, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_reduce, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_grads_reduced_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_mixed_precision_e2e_full_shard, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_mixed_precision_no_reshard_after_forward, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionIgnoredModules::test_mixed_precision_with_ignored_module, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule_skip_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule_skip_inputs_error, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions_error, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_external_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPTrainEval::test_train_ema_eval_flow 2025-12-04T13:39:22.5586187Z 2025-12-04T13:39:22.5586333Z Finished distributed/fsdp/test_fsdp_mixed_precision 1/1 ... [2025-12-04 13:39:22.554498][2295061.203679365], took 6.47min 2025-12-04T13:39:22.5586781Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:39:22.5587169Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:39:22.5587392Z Running distributed/test_c10d_nccl 2/3 ... [2025-12-04 13:39:22.558020][2295061.20720422] 2025-12-04T13:39:22.5587608Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:39:22.5588019Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_nccl.py', '--shard-id=2', '--num-shards=3', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:39:22.558192] 2025-12-04T13:49:13.6560390Z 2025-12-04T13:49:13.6561016Z distributed/test_c10d_nccl 2/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_nccl_2.3_ef0a5ca71e33a7d5_.log 2025-12-04T13:49:13.6574082Z Running 83 items in this shard: test/distributed/test_c10d_nccl.py::TimeoutTest::test_default_store_timeout_nccl, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLInitTest::test_scalable_init, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_pg, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_comm_split_subgroup, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_cuda_event_cache_mthd_race, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_destruct_before_terminate_pg, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_deterministic_mode_no_break, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_extend_nccl_pg_timeout_backend0, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float16, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float64, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float8_e4m3fn, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_check, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_rank_filter, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_p2p, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_nccl_pg_timeout_backend0, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_process_group_desc, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_basic, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_multiple_iterations, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module_with_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_arbitrary_forward_return_value, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_bf16_compress_wrapper_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_builtin_ddp_comm_hooks_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_static_graph, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_multi_device_module_config, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_weight_sharing, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl, 
test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_find_unused_parameters_kwarg_debug_detail, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_2devicemodule, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_invalid_powerSGD_state, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_integer_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_torch_device_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_multi_device_module_device_ids_None, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_pass_default_pg, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_mixed_ops, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_nonblocking, test/distributed/test_c10d_nccl.py::NcclUserBufferRegistrationTest::test_nccl_window_registration, test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_manager_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_broadcast_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier_device_ids, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_off, test/distributed/test_c10d_nccl.py::CommTest::test_nncl_rank_membership, test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_high_priority_stream, test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_base_k, test/distributed/test_c10d_nccl.py::CommTest::test_unwaited, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_collectives, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_default_process_group, test/distributed/test_c10d_nccl.py::LargeCommTest::test_batch_send_recv_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device0_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync, test/distributed/test_c10d_nccl.py::LargeCommTest::test_scatter_object_list_subgroup_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device1_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_True_async_op_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce0_timing_enabled_False, 
test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_circular_buffer_full_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_False_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_pickle_timing_enabled_False_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_True_only_active_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLLargerScaleTest::test_comm_recursive_split_group 2025-12-04T13:49:13.6585534Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::TimeoutTest::test_default_store_timeout_nccl 2025-12-04T13:49:13.6585845Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLInitTest::test_scalable_init 2025-12-04T13:49:13.6586168Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_pg 2025-12-04T13:49:13.6586499Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_comm_split_subgroup 2025-12-04T13:49:13.6586835Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_cuda_event_cache_mthd_race 2025-12-04T13:49:13.6587189Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_destruct_before_terminate_pg 2025-12-04T13:49:13.6587542Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_deterministic_mode_no_break 2025-12-04T13:49:13.6587897Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_extend_nccl_pg_timeout_backend0 2025-12-04T13:49:13.6588354Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float16 2025-12-04T13:49:13.6588677Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float64 2025-12-04T13:49:13.6589012Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float8_e4m3fn 2025-12-04T13:49:13.6589330Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_check 2025-12-04T13:49:13.6589634Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_rank_filter 2025-12-04T13:49:13.6589962Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_False 2025-12-04T13:49:13.6590293Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_p2p 2025-12-04T13:49:13.6590661Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_nccl_pg_timeout_backend0 2025-12-04T13:49:13.6591003Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_process_group_desc 2025-12-04T13:49:13.6591329Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_basic 2025-12-04T13:49:13.6591673Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_multiple_iterations 2025-12-04T13:49:13.6592029Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_True 2025-12-04T13:49:13.6592408Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module_with_grad_is_view 2025-12-04T13:49:13.6592796Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_arbitrary_forward_return_value 2025-12-04T13:49:13.6593181Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_bf16_compress_wrapper_nccl 2025-12-04T13:49:13.6593575Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_builtin_ddp_comm_hooks_nccl_grad_is_view 2025-12-04T13:49:13.6593956Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module 2025-12-04T13:49:13.6594340Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False 2025-12-04T13:49:13.6594768Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_False 2025-12-04T13:49:13.6595188Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view 2025-12-04T13:49:13.6595593Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_static_graph 2025-12-04T13:49:13.6595992Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_nccl 2025-12-04T13:49:13.6596371Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_multi_device_module_config 2025-12-04T13:49:13.6596724Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_weight_sharing 2025-12-04T13:49:13.6597067Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl 2025-12-04T13:49:13.6597458Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_find_unused_parameters_kwarg_debug_detail 2025-12-04T13:49:13.6597830Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_2devicemodule 2025-12-04T13:49:13.6598225Z Running 1 
items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_invalid_powerSGD_state 2025-12-04T13:49:13.6598589Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward 2025-12-04T13:49:13.6598982Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_integer_list 2025-12-04T13:49:13.6599403Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_torch_device_list 2025-12-04T13:49:13.6599845Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_multi_device_module_device_ids_None 2025-12-04T13:49:13.6600231Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_pass_default_pg 2025-12-04T13:49:13.6600588Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl_grad_is_view 2025-12-04T13:49:13.6600942Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_mixed_ops 2025-12-04T13:49:13.6601257Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_nonblocking 2025-12-04T13:49:13.6601595Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclUserBufferRegistrationTest::test_nccl_window_registration 2025-12-04T13:49:13.6601931Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_manager_nccl 2025-12-04T13:49:13.6602233Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_nccl 2025-12-04T13:49:13.6602553Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_broadcast_coalesced_nccl 2025-12-04T13:49:13.6602829Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier 2025-12-04T13:49:13.6603118Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier_device_ids 2025-12-04T13:49:13.6603411Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_off 2025-12-04T13:49:13.6603703Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nncl_rank_membership 2025-12-04T13:49:13.6604023Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_high_priority_stream 2025-12-04T13:49:13.6604325Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_base_k 2025-12-04T13:49:13.6604592Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_unwaited 2025-12-04T13:49:13.6604917Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_collectives 2025-12-04T13:49:13.6605317Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_default_process_group 2025-12-04T13:49:13.6605700Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_batch_send_recv_subgroup_group_rank_False 2025-12-04T13:49:13.6606072Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device0_group_rank_True 2025-12-04T13:49:13.6606437Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_False 2025-12-04T13:49:13.6606763Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_True 2025-12-04T13:49:13.6607094Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_False 2025-12-04T13:49:13.6607422Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_subgroup_group_rank_False 2025-12-04T13:49:13.6607728Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync 2025-12-04T13:49:13.6608048Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_scatter_object_list_subgroup_group_rank_True 2025-12-04T13:49:13.6608476Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_True 2025-12-04T13:49:13.6608870Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device1_group_rank_True 2025-12-04T13:49:13.6609273Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_True_async_op_True 2025-12-04T13:49:13.6609677Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce0_timing_enabled_False 2025-12-04T13:49:13.6610079Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_False 2025-12-04T13:49:13.6610472Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_True 2025-12-04T13:49:13.6610848Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_True 2025-12-04T13:49:13.6611350Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_circular_buffer_full_timing_enabled_True 2025-12-04T13:49:13.6611709Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_timing_enabled_False 2025-12-04T13:49:13.6612067Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_False 2025-12-04T13:49:13.6612437Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_True 2025-12-04T13:49:13.6612801Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_False 2025-12-04T13:49:13.6613175Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_True 2025-12-04T13:49:13.6613562Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_False_include_collectives_True 2025-12-04T13:49:13.6613943Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_False 2025-12-04T13:49:13.6614329Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_pickle_timing_enabled_False_include_collectives_True 2025-12-04T13:49:13.6614712Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_True_only_active_False 2025-12-04T13:49:13.6615088Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLLargerScaleTest::test_comm_recursive_split_group 2025-12-04T13:49:13.6615291Z 2025-12-04T13:49:13.6615408Z Finished distributed/test_c10d_nccl 2/3 ... [2025-12-04 13:49:13.656413][2295652.30559399], took 9.85min 2025-12-04T13:49:13.6615825Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:49:13.6616228Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:49:13.6616451Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:49:13.6616634Z Uploading artifacts took 0.00 seconds 2025-12-04T13:49:13.6616834Z Running distributed/elastic/timer/api_test 1/1 ... [2025-12-04 13:49:13.659978][2295652.309162234] 2025-12-04T13:49:13.6617037Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:49:13.6617449Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/elastic/timer/api_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:49:13.660138] 2025-12-04T13:49:14.5999329Z 2025-12-04T13:49:14.6000704Z distributed/elastic/timer/api_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.timer.api_test_1.1_86547a72b69ce307_.log 2025-12-04T13:49:14.6001219Z 2025-12-04T13:49:14.6001451Z Finished distributed/elastic/timer/api_test 1/1 ... 
[2025-12-04 13:49:14.599542][2295653.248720519], took 0.02min
2025-12-04T13:49:14.6024004Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml
2025-12-04T13:49:14.6040723Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:49:16.8464457Z Running test batch 'tests to run' cost 9398.63 seconds
2025-12-04T13:49:16.8465722Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:16.8469040Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856156_06970a90d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8626721Z /var/lib/jenkins/pytorch/tools/stats/upload_metrics.py:156: UserWarning: Error uploading metric td_test_failure_stats_v2 to DynamoDB: Unable to locate credentials
2025-12-04T13:49:18.8627784Z warn(f"Error uploading metric {metric_name} to DynamoDB: {e}")
2025-12-04T13:49:18.8628292Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8630588Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07caa458d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8644906Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8645410Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cae7f6d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8661428Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8661901Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cb29a0d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8677859Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8678373Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cb6a6ed11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8693825Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8694272Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cba8e4d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8710411Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8710840Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cbea0cd11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8726667Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8727222Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cc2a76d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8742502Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8744690Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cc6914d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8759040Z distributed/fsdp/test_fsdp_uneven 1/1 failed!
2025-12-04T13:49:18.8759267Z distributed/fsdp/test_fsdp_exec_order 1/1 failed!
2025-12-04T13:49:18.8759475Z distributed/fsdp/test_fsdp_traversal 1/1 failed!
2025-12-04T13:49:18.8759700Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 failed!
2025-12-04T13:49:18.8759913Z distributed/fsdp/test_fsdp_checkpoint 1/1 failed!
2025-12-04T13:49:18.8760111Z distributed/fsdp/test_fsdp_fine_tune 1/1 failed!
2025-12-04T13:49:18.8760319Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 failed!
2025-12-04T13:49:18.8760522Z distributed/fsdp/test_fsdp_comm 1/1 failed!
2025-12-04T13:49:18.8760694Z distributed/fsdp/test_fsdp_core 1/3 failed!
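Note on the block above: the test harness invokes each selected test file in its own process with explicit --shard-id/--num-shards arguments (see the "Executing [...]" line), records per-file results, attempts to upload td_test_failure_stats_v2 metrics (the uploads fail here with "Unable to locate credentials"), and finally lists the test files that failed in this shard. The following is a minimal, illustrative sketch of deterministic shard assignment; select_shard is a hypothetical helper and is not PyTorch's actual run_test.py sharding code, which also takes recorded test times into account.

    # Illustrative sketch only -- not PyTorch's run_test.py sharding logic.
    # Deterministically assigns test files to 1-based shards.
    def select_shard(test_files, shard_id, num_shards):
        assert 1 <= shard_id <= num_shards
        ordered = sorted(test_files)  # stable order across re-runs
        return [f for i, f in enumerate(ordered) if i % num_shards == shard_id - 1]

    if __name__ == "__main__":
        files = [
            "distributed/test_c10d_nccl.py",
            "distributed/elastic/timer/api_test.py",
            "distributed/fsdp/test_fsdp_core.py",
        ]
        print(select_shard(files, shard_id=2, num_shards=3))

Sorting before dealing files out keeps the assignment stable, so a given shard always receives the same files for a given file list.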
2025-12-04T13:49:19.5289529Z
2025-12-04T13:49:19.5290347Z real 156m44.385s
2025-12-04T13:49:19.5290636Z user 432m13.912s
2025-12-04T13:49:19.5290856Z sys 515m25.252s
2025-12-04T13:49:19.5291021Z + sccache_epilogue
2025-12-04T13:49:19.5291256Z + echo '::group::Sccache Compilation Log'
2025-12-04T13:49:19.5292009Z ##[group]Sccache Compilation Log
2025-12-04T13:49:19.5292798Z + echo '=================== sccache compilation log ==================='
2025-12-04T13:49:19.5293094Z =================== sccache compilation log ===================
2025-12-04T13:49:19.5293524Z + python /var/lib/jenkins/pytorch/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log
2025-12-04T13:49:19.5370038Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ==========='
2025-12-04T13:49:19.5370435Z =========== If your build fails, please take a look at the log above for possible reasons ===========
2025-12-04T13:49:19.5370734Z + sccache --show-stats
2025-12-04T13:49:19.5391826Z Compile requests 687
2025-12-04T13:49:19.5392036Z Compile requests executed 0
2025-12-04T13:49:19.5392227Z Cache hits 0
2025-12-04T13:49:19.5392398Z Cache misses 0
2025-12-04T13:49:19.5392574Z Cache hits rate -
2025-12-04T13:49:19.5392753Z Cache timeouts 0
2025-12-04T13:49:19.5392934Z Cache read errors 0
2025-12-04T13:49:19.5393112Z Forced recaches 0
2025-12-04T13:49:19.5393282Z Cache write errors 0
2025-12-04T13:49:19.5393547Z Cache errors 0
2025-12-04T13:49:19.5393724Z Compilations 0
2025-12-04T13:49:19.5393960Z Compilation failures 0
2025-12-04T13:49:19.5394140Z Non-cacheable compilations 0
2025-12-04T13:49:19.5394332Z Non-cacheable calls 1
2025-12-04T13:49:19.5394506Z Non-compilation calls 686
2025-12-04T13:49:19.5394695Z Unsupported compiler calls 0
2025-12-04T13:49:19.5394885Z Average cache write 0.000 s
2025-12-04T13:49:19.5395131Z Average compiler 0.000 s
2025-12-04T13:49:19.5395318Z Average cache read hit 0.000 s
2025-12-04T13:49:19.5395518Z Failed distributed compilations 0
2025-12-04T13:49:19.5395645Z
2025-12-04T13:49:19.5395713Z Non-cacheable reasons:
2025-12-04T13:49:19.5395874Z -E 1
2025-12-04T13:49:19.5395987Z
2025-12-04T13:49:19.5396109Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache"
2025-12-04T13:49:19.5396350Z Use direct/preprocessor mode? yes
2025-12-04T13:49:19.5396543Z Version (client) 0.10.0
2025-12-04T13:49:19.5396729Z Max cache size 10 GiB
2025-12-04T13:49:19.5396907Z + sccache --stop-server
2025-12-04T13:49:19.5402571Z Stopping sccache server...
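Note on the sccache epilogue above: 687 compile requests were seen but none were executed or cached -- 686 were non-compilation calls and the single non-cacheable call was a preprocessor-only (-E) invocation -- which is the expected picture for a test-only job that builds nothing. The rough sketch below turns such "key value" stats text into a dictionary; it is illustrative only, and the exact field names and layout of sccache --show-stats can differ between versions.

    # Illustrative sketch only: parse "key value"-style stats lines such as the
    # sccache output above. Field layout is assumed and may vary by version.
    def parse_stats(text):
        stats = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.endswith(":"):
                continue  # skip blanks and headers like "Non-cacheable reasons:"
            key, _, value = line.rpartition(" ")  # last token is the value
            stats[key.strip()] = value.strip()
        return stats

    sample = "Compile requests 687\nCompile requests executed 0\nCache hits 0\nCache misses 0"
    parsed = parse_stats(sample)
    print(parsed["Compile requests"], parsed["Cache hits"])  # -> 687 0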
2025-12-04T13:49:19.5404429Z Compile requests 687 2025-12-04T13:49:19.5404812Z Compile requests executed 0 2025-12-04T13:49:19.5405105Z Cache hits 0 2025-12-04T13:49:19.5405387Z Cache misses 0 2025-12-04T13:49:19.5405676Z Cache hits rate - 2025-12-04T13:49:19.5405957Z Cache timeouts 0 2025-12-04T13:49:19.5406232Z Cache read errors 0 2025-12-04T13:49:19.5406504Z Forced recaches 0 2025-12-04T13:49:19.5406771Z Cache write errors 0 2025-12-04T13:49:19.5407042Z Cache errors 0 2025-12-04T13:49:19.5407316Z Compilations 0 2025-12-04T13:49:19.5407614Z Compilation failures 0 2025-12-04T13:49:19.5407893Z Non-cacheable compilations 0 2025-12-04T13:49:19.5408326Z Non-cacheable calls 1 2025-12-04T13:49:19.5408600Z Non-compilation calls 686 2025-12-04T13:49:19.5408871Z Unsupported compiler calls 0 2025-12-04T13:49:19.5409160Z Average cache write 0.000 s 2025-12-04T13:49:19.5409455Z Average compiler 0.000 s 2025-12-04T13:49:19.5409737Z Average cache read hit 0.000 s 2025-12-04T13:49:19.5410025Z Failed distributed compilations 0 2025-12-04T13:49:19.5410213Z 2025-12-04T13:49:19.5410313Z Non-cacheable reasons: 2025-12-04T13:49:19.5410556Z -E 1 2025-12-04T13:49:19.5410964Z 2025-12-04T13:49:19.5411152Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-12-04T13:49:19.5411533Z Use direct/preprocessor mode? yes 2025-12-04T13:49:19.5411822Z Version (client) 0.10.0 2025-12-04T13:49:19.5412103Z Max cache size 10 GiB 2025-12-04T13:49:19.5412394Z + echo ::endgroup:: 2025-12-04T13:49:19.5412914Z ##[endgroup] 2025-12-04T13:49:19.5464486Z ##[error]Process completed with exit code 1. 2025-12-04T13:49:19.5493172Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-12-04T13:49:19.5504608Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-12-04T13:49:19.5505086Z docker exec -t "5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test" 2025-12-04T13:49:19.5510177Z shell: /usr/bin/bash -e {0} 2025-12-04T13:49:19.5510299Z env: 2025-12-04T13:49:19.5510413Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:19.5510556Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:19.5510743Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:19.5511020Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:19.5511558Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:19.5512128Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:19.5512247Z AWS_REGION: us-east-1 2025-12-04T13:49:19.5512496Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:19.5512648Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:19.5514657Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:19.5514832Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:19.5515025Z ##[endgroup] 2025-12-04T13:49:19.6342435Z ##[group]Run docker exec -t "5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d" sh -c "sudo chown -R 1001:1001 test" 2025-12-04T13:49:19.6342910Z docker exec -t "5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d" sh -c "sudo chown -R 1001:1001 test" 2025-12-04T13:49:19.6347140Z 
shell: /usr/bin/bash -e {0} 2025-12-04T13:49:19.6347264Z env: 2025-12-04T13:49:19.6347365Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:19.6347509Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:19.6347693Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:19.6347880Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:19.6348575Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:19.6349077Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:19.6349202Z AWS_REGION: us-east-1 2025-12-04T13:49:19.6349392Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:19.6349553Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:19.6351573Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:19.6351751Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:19.6351944Z ##[endgroup] 2025-12-04T13:49:19.7207636Z ##[group]Run cat test/**/*_toprint.log || true 2025-12-04T13:49:19.7207813Z cat test/**/*_toprint.log || true 2025-12-04T13:49:19.7210772Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T13:49:19.7210932Z env: 2025-12-04T13:49:19.7211040Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:19.7211188Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:19.7211379Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:19.7211560Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:19.7212097Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:19.7212627Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:19.7212772Z AWS_REGION: us-east-1 2025-12-04T13:49:19.7212921Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:19.7213088Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:19.7215138Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:19.7215317Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:19.7215506Z ##[endgroup] 2025-12-04T13:49:19.7259951Z cat: 'test/**/*_toprint.log': No such file or directory 2025-12-04T13:49:19.7328629Z Prepare all required actions 2025-12-04T13:49:19.7329015Z Getting action download info 2025-12-04T13:49:20.1267717Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-12-04T13:49:20.9538045Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-12-04T13:49:21.8795944Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-12-04T13:49:21.8796103Z with: 2025-12-04T13:49:21.8796198Z use-gha: true 2025-12-04T13:49:21.8796362Z file-suffix: test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540 2025-12-04T13:49:21.8796546Z s3-bucket: gha-artifacts 2025-12-04T13:49:21.8796739Z env: 2025-12-04T13:49:21.8796837Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:21.8796976Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:21.8797159Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:21.8797349Z 
RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:21.8797862Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:21.8798422Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:21.8798544Z AWS_REGION: us-east-1 2025-12-04T13:49:21.8798707Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:21.8798865Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:21.8800870Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:21.8801047Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:21.8801237Z ##[endgroup] 2025-12-04T13:49:21.8830940Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:49:21.8831075Z with: 2025-12-04T13:49:21.8831275Z name: test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip 2025-12-04T13:49:21.8831493Z retention-days: 14 2025-12-04T13:49:21.8831607Z if-no-files-found: warn 2025-12-04T13:49:21.8831719Z path: test/**/*.json 2025-12-04T13:49:21.8831829Z compression-level: 6 2025-12-04T13:49:21.8831933Z overwrite: false 2025-12-04T13:49:21.8832044Z include-hidden-files: false 2025-12-04T13:49:21.8832155Z env: 2025-12-04T13:49:21.8832248Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:21.8832388Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:21.8832573Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:21.8832742Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:21.8833251Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:21.8833740Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:21.8833863Z AWS_REGION: us-east-1 2025-12-04T13:49:21.8833998Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:21.8834151Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:21.8836213Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:21.8836387Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:21.8836572Z ##[endgroup] 2025-12-04T13:49:22.3018588Z With the provided path, there will be 6 files uploaded 2025-12-04T13:49:22.3021164Z Artifact name is valid! 2025-12-04T13:49:22.3022049Z Root directory input is valid! 2025-12-04T13:49:22.5822774Z Beginning upload of artifact content to blob storage 2025-12-04T13:49:22.9570981Z Uploaded bytes 44615 2025-12-04T13:49:23.0253320Z Finished uploading artifact content to blob storage! 2025-12-04T13:49:23.0254560Z SHA256 digest of uploaded artifact zip is 522cfd5f062ae50bd9823d80787cbd4928b98ba8f996043c0a02d5a3c891ba7b 2025-12-04T13:49:23.0255468Z Finalizing artifact upload 2025-12-04T13:49:23.1824938Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip.zip successfully finalized. Artifact ID 4764717137 2025-12-04T13:49:23.1826381Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip has been successfully uploaded! Final size is 44615 bytes. 
Artifact ID is 4764717137 2025-12-04T13:49:23.1830498Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922798714/artifacts/4764717137 2025-12-04T13:49:23.1955133Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:49:23.1955291Z with: 2025-12-04T13:49:23.1955501Z name: test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip 2025-12-04T13:49:23.1955836Z retention-days: 14 2025-12-04T13:49:23.1955951Z if-no-files-found: ignore 2025-12-04T13:49:23.1956082Z path: test/**/*.xml test/**/*.csv 2025-12-04T13:49:23.1956212Z compression-level: 6 2025-12-04T13:49:23.1956336Z overwrite: false 2025-12-04T13:49:23.1956452Z include-hidden-files: false 2025-12-04T13:49:23.1956569Z env: 2025-12-04T13:49:23.1956669Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:23.1956814Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:23.1957012Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:23.1957189Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:23.1957709Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:23.1958391Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:23.1958514Z AWS_REGION: us-east-1 2025-12-04T13:49:23.1958698Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:23.1958857Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:23.1960888Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:23.1961069Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:23.1961259Z ##[endgroup] 2025-12-04T13:49:23.6035824Z With the provided path, there will be 808 files uploaded 2025-12-04T13:49:23.6039138Z Artifact name is valid! 2025-12-04T13:49:23.6039415Z Root directory input is valid! 2025-12-04T13:49:23.8241770Z Beginning upload of artifact content to blob storage 2025-12-04T13:49:24.5930445Z Uploaded bytes 681492 2025-12-04T13:49:24.6580145Z Finished uploading artifact content to blob storage! 2025-12-04T13:49:24.6582677Z SHA256 digest of uploaded artifact zip is 231eb3f54fc2665f1723cd26e833c8a548e4a409a21546d5f01010862e8d7fa5 2025-12-04T13:49:24.6583301Z Finalizing artifact upload 2025-12-04T13:49:24.8169117Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip.zip successfully finalized. Artifact ID 4764717455 2025-12-04T13:49:24.8170662Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip has been successfully uploaded! Final size is 681492 bytes. 
Artifact ID is 4764717455 2025-12-04T13:49:24.8175730Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922798714/artifacts/4764717455 2025-12-04T13:49:24.8329444Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:49:24.8329593Z with: 2025-12-04T13:49:24.8329784Z name: logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip 2025-12-04T13:49:24.8329998Z retention-days: 14 2025-12-04T13:49:24.8330116Z if-no-files-found: ignore 2025-12-04T13:49:24.8330244Z path: usage_log.txt test/**/*.log 2025-12-04T13:49:24.8330387Z compression-level: 6 2025-12-04T13:49:24.8330498Z overwrite: false 2025-12-04T13:49:24.8330610Z include-hidden-files: false 2025-12-04T13:49:24.8330729Z env: 2025-12-04T13:49:24.8330824Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:24.8330969Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:24.8331323Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:24.8331500Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:24.8332024Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:24.8332583Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:24.8332713Z AWS_REGION: us-east-1 2025-12-04T13:49:24.8332883Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:24.8333097Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:24.8335162Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:24.8335343Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:24.8335534Z ##[endgroup] 2025-12-04T13:49:25.2826191Z Multiple search paths detected. Calculating the least common ancestor of all paths 2025-12-04T13:49:25.2827282Z The least common ancestor is /home/runner/_work/pytorch/pytorch. This will be the root directory of the artifact 2025-12-04T13:49:25.2827594Z With the provided path, there will be 84 files uploaded 2025-12-04T13:49:25.2830552Z Artifact name is valid! 2025-12-04T13:49:25.2831232Z Root directory input is valid! 2025-12-04T13:49:25.5111496Z Beginning upload of artifact content to blob storage 2025-12-04T13:49:26.0655357Z Uploaded bytes 395449 2025-12-04T13:49:26.1332648Z Finished uploading artifact content to blob storage! 2025-12-04T13:49:26.1333946Z SHA256 digest of uploaded artifact zip is b29fd9d0f808ab53863051eb5997c3790697a39efb0251c741829f3455d61657 2025-12-04T13:49:26.1334892Z Finalizing artifact upload 2025-12-04T13:49:26.2871280Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip.zip successfully finalized. Artifact ID 4764717750 2025-12-04T13:49:26.2872707Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip has been successfully uploaded! Final size is 395449 bytes. Artifact ID is 4764717750 2025-12-04T13:49:26.2876559Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922798714/artifacts/4764717750 2025-12-04T13:49:26.3020874Z ##[group]Run # shellcheck disable=SC2156 2025-12-04T13:49:26.3021111Z # shellcheck disable=SC2156 2025-12-04T13:49:26.3021413Z find . 
-iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-12-04T13:49:26.3026009Z shell: /usr/bin/bash -e {0} 2025-12-04T13:49:26.3026189Z env: 2025-12-04T13:49:26.3037183Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:26.3037368Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:26.3037568Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:26.3037746Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:26.3038550Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:26.3039059Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:26.3039193Z AWS_REGION: us-east-1 2025-12-04T13:49:26.3039380Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:26.3039551Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:26.3041566Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:26.3041752Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:26.3041952Z ##[endgroup] 2025-12-04T13:49:26.4389640Z ##[group]Run actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 2025-12-04T13:49:26.4389840Z with: 2025-12-04T13:49:26.4389988Z name: coredumps-distributed-2-3-linux.rocm.gpu.gfx942.4.b 2025-12-04T13:49:26.4390163Z retention-days: 14 2025-12-04T13:49:26.4390278Z if-no-files-found: ignore 2025-12-04T13:49:26.4390402Z path: ./**/core.[1-9]* 2025-12-04T13:49:26.4390522Z compression-level: 6 2025-12-04T13:49:26.4390632Z overwrite: false 2025-12-04T13:49:26.4390747Z include-hidden-files: false 2025-12-04T13:49:26.4390868Z env: 2025-12-04T13:49:26.4391052Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:26.4391201Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:26.4391395Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:26.4391572Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:26.4392114Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:26.4392690Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:26.4392820Z AWS_REGION: us-east-1 2025-12-04T13:49:26.4392998Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:26.4393170Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:26.4395224Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:26.4395413Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:26.4395614Z ##[endgroup] 2025-12-04T13:49:29.8994054Z No files were found with the provided path: ./**/core.[1-9]*. No artifacts will be uploaded. 2025-12-04T13:49:29.9154850Z Post job cleanup. 2025-12-04T13:49:29.9167896Z Post job cleanup. 2025-12-04T13:49:29.9372905Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T13:49:29.9593722Z Post job cleanup. 2025-12-04T13:49:30.0259082Z Post job cleanup. 2025-12-04T13:49:30.0290231Z Post job cleanup. 
2025-12-04T13:49:30.0755018Z [command]/usr/bin/git version 2025-12-04T13:49:30.0781385Z git version 2.52.0 2025-12-04T13:49:30.0805407Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/10fbb372-00e7-4b77-8d9d-8659fcf59d40/.gitconfig' 2025-12-04T13:49:30.0811652Z Temporarily overriding HOME='/home/runner/_work/_temp/10fbb372-00e7-4b77-8d9d-8659fcf59d40' before making global git config changes 2025-12-04T13:49:30.0812007Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T13:49:30.0814224Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T13:49:30.0842028Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T13:49:30.0865675Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T13:49:30.1088060Z Entering 'android/libs/fbjni' 2025-12-04T13:49:30.1127411Z Entering 'third_party/FP16' 2025-12-04T13:49:30.1155215Z Entering 'third_party/FXdiv' 2025-12-04T13:49:30.1181340Z Entering 'third_party/NNPACK' 2025-12-04T13:49:30.1214462Z Entering 'third_party/NVTX' 2025-12-04T13:49:30.1256051Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:30.1285454Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:30.1326092Z Entering 'third_party/aiter' 2025-12-04T13:49:30.1355726Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:30.1387422Z Entering 'third_party/benchmark' 2025-12-04T13:49:30.1411470Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:30.1441453Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:30.1466007Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:30.1491907Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:30.1520183Z Entering 'third_party/cutlass' 2025-12-04T13:49:30.1548883Z Entering 'third_party/fbgemm' 2025-12-04T13:49:30.1582881Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:30.1612750Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:30.1640764Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:30.1665615Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:30.1693823Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:30.1722345Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:30.1746167Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:30.1775426Z Entering 'third_party/flash-attention' 2025-12-04T13:49:30.1806253Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:30.1832252Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:30.1860323Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:30.1886809Z Entering 'third_party/fmt' 2025-12-04T13:49:30.1913403Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:30.1938526Z Entering 'third_party/gloo' 2025-12-04T13:49:30.1963794Z Entering 'third_party/googletest' 2025-12-04T13:49:30.1995769Z Entering 'third_party/ideep' 2025-12-04T13:49:30.2021633Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:30.2067129Z Entering 'third_party/ittapi' 2025-12-04T13:49:30.2092911Z Entering 'third_party/kineto' 2025-12-04T13:49:30.2123396Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:30.2151018Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:30.2178999Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:30.2206768Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:30.2230297Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:30.2258695Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:30.2289889Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:30.2313941Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:30.2336140Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:30.2360328Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:30.2383570Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:30.2407303Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.2443387Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.2471722Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:30.2496450Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:30.2522966Z Entering 'third_party/kleidiai' 2025-12-04T13:49:30.2550531Z Entering 'third_party/mimalloc' 2025-12-04T13:49:30.2573623Z Entering 'third_party/nlohmann' 2025-12-04T13:49:30.2600826Z Entering 'third_party/onnx' 2025-12-04T13:49:30.2631923Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:30.2670112Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:30.2695063Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:30.2725613Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:30.2759380Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:30.2782429Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:30.2817427Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:30.2840081Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:30.2864904Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:30.2888614Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.2920145Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.2945725Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:30.2977988Z Entering 'third_party/pocketfft' 2025-12-04T13:49:30.3002196Z Entering 'third_party/protobuf' 2025-12-04T13:49:30.3028292Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:30.3054312Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:30.3084579Z Entering 'third_party/psimd' 2025-12-04T13:49:30.3115553Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:30.3139423Z Entering 'third_party/pybind11' 2025-12-04T13:49:30.3164980Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:30.3189376Z Entering 'third_party/sleef' 2025-12-04T13:49:30.3216490Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:30.3240614Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:30.3263506Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:30.3287083Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:30.3310151Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:30.3337605Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:30.3381138Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T13:49:30.3399848Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3406763Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T13:49:30.3425851Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T13:49:30.3644962Z Entering 'android/libs/fbjni' 2025-12-04T13:49:30.3661130Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3684405Z Entering 'third_party/FP16' 2025-12-04T13:49:30.3703340Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3723210Z Entering 'third_party/FXdiv' 2025-12-04T13:49:30.3740809Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3757567Z Entering 'third_party/NNPACK' 2025-12-04T13:49:30.3774816Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3794288Z Entering 'third_party/NVTX' 2025-12-04T13:49:30.3816644Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3836611Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:30.3851850Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3870396Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:30.3887301Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3917905Z Entering 'third_party/aiter' 2025-12-04T13:49:30.3935308Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3955804Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:30.3972611Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3997159Z Entering 'third_party/benchmark' 2025-12-04T13:49:30.4019833Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4045619Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:30.4064318Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4098288Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:30.4116230Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4136216Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:30.4153125Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4172713Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:30.4186358Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4204239Z Entering 'third_party/cutlass' 2025-12-04T13:49:30.4226584Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4249480Z Entering 'third_party/fbgemm' 2025-12-04T13:49:30.4265493Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4286428Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:30.4306702Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4324920Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:30.4350931Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4373609Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:30.4394280Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4417557Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:30.4435551Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4459347Z 
Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:30.4480137Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4506912Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:30.4527083Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4545500Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:30.4562274Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4588298Z Entering 'third_party/flash-attention' 2025-12-04T13:49:30.4605586Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4624434Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:30.4640178Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4663970Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:30.4680993Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4703459Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:30.4726524Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4757134Z Entering 'third_party/fmt' 2025-12-04T13:49:30.4776077Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4796408Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:30.4814935Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4835971Z Entering 'third_party/gloo' 2025-12-04T13:49:30.4858526Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4876650Z Entering 'third_party/googletest' 2025-12-04T13:49:30.4897324Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4924581Z Entering 'third_party/ideep' 2025-12-04T13:49:30.4943506Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4965769Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:30.4983594Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5007964Z Entering 'third_party/ittapi' 2025-12-04T13:49:30.5024182Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5045645Z Entering 'third_party/kineto' 2025-12-04T13:49:30.5060618Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5091917Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:30.5106353Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5135962Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:30.5152946Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5177603Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:30.5205342Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5226018Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:30.5246255Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5276723Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:30.5293989Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5320119Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:30.5334121Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5356652Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:30.5383362Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5404083Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:30.5418551Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5438842Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:30.5460431Z http.https://github.com/.extraheader 
2025-12-04T13:49:30.5480148Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:30.5501832Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5522920Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:30.5538452Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5559706Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.5575842Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5597429Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.5613868Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5643944Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:30.5660958Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5679307Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:30.5695706Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5717574Z Entering 'third_party/kleidiai' 2025-12-04T13:49:30.5732976Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5754138Z Entering 'third_party/mimalloc' 2025-12-04T13:49:30.5773835Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5792630Z Entering 'third_party/nlohmann' 2025-12-04T13:49:30.5808464Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5830647Z Entering 'third_party/onnx' 2025-12-04T13:49:30.5846466Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5875795Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:30.5897229Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5922578Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:30.5937505Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5959206Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:30.5973429Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5994163Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:30.6010569Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6029074Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:30.6049452Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6068553Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:30.6081443Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6100988Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:30.6115332Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6135119Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:30.6149989Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6172549Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:30.6190854Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6208020Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.6225652Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6246021Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.6265412Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6285485Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:30.6303979Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6331226Z Entering 'third_party/pocketfft' 
2025-12-04T13:49:30.6347018Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6365639Z Entering 'third_party/protobuf' 2025-12-04T13:49:30.6388502Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6416447Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:30.6433048Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6459134Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:30.6479461Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6502182Z Entering 'third_party/psimd' 2025-12-04T13:49:30.6524755Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6546479Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:30.6563420Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6590105Z Entering 'third_party/pybind11' 2025-12-04T13:49:30.6612764Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6632786Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:30.6658329Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6686744Z Entering 'third_party/sleef' 2025-12-04T13:49:30.6701468Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6724862Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:30.6739921Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6757460Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:30.6773382Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6794305Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:30.6809163Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6829499Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:30.6842414Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6863432Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:30.6877146Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6895300Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:30.6914499Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6959426Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.6986480Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T13:49:30.7159803Z Entering 'android/libs/fbjni' 2025-12-04T13:49:30.7177360Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T13:49:30.7188507Z Entering 'third_party/FP16' 2025-12-04T13:49:30.7200570Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T13:49:30.7211441Z Entering 'third_party/FXdiv' 2025-12-04T13:49:30.7230886Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T13:49:30.7241713Z Entering 'third_party/NNPACK' 2025-12-04T13:49:30.7252945Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T13:49:30.7262989Z Entering 'third_party/NVTX' 2025-12-04T13:49:30.7274455Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T13:49:30.7284003Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:30.7295345Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T13:49:30.7305951Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:30.7317892Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T13:49:30.7337298Z Entering 'third_party/aiter' 2025-12-04T13:49:30.7349531Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T13:49:30.7358814Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:30.7376250Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7390310Z Entering 'third_party/benchmark' 2025-12-04T13:49:30.7401789Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:30.7411943Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:30.7423150Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7436653Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:30.7455552Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T13:49:30.7466282Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:30.7477127Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T13:49:30.7486704Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:30.7498791Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T13:49:30.7509141Z Entering 'third_party/cutlass' 2025-12-04T13:49:30.7522759Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T13:49:30.7540877Z Entering 'third_party/fbgemm' 2025-12-04T13:49:30.7555905Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T13:49:30.7566362Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:30.7581359Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T13:49:30.7591062Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:30.7602819Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7614854Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:30.7625021Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T13:49:30.7639180Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:30.7654118Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T13:49:30.7671186Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:30.7683932Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T13:49:30.7694221Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:30.7705567Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T13:49:30.7715413Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:30.7730503Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T13:49:30.7742168Z Entering 
'third_party/flash-attention' 2025-12-04T13:49:30.7756040Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T13:49:30.7765657Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:30.7777322Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7788943Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:30.7806964Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T13:49:30.7821451Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:30.7834192Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T13:49:30.7845397Z Entering 'third_party/fmt' 2025-12-04T13:49:30.7857265Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:30.7867728Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:30.7878828Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T13:49:30.7888801Z Entering 'third_party/gloo' 2025-12-04T13:49:30.7900814Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T13:49:30.7911523Z Entering 'third_party/googletest' 2025-12-04T13:49:30.7922786Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.7935301Z Entering 'third_party/ideep' 2025-12-04T13:49:30.7946781Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T13:49:30.7957165Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:30.7968829Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T13:49:30.7981778Z Entering 'third_party/ittapi' 2025-12-04T13:49:30.7993098Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T13:49:30.8003206Z Entering 'third_party/kineto' 2025-12-04T13:49:30.8015369Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T13:49:30.8025170Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:30.8041432Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T13:49:30.8052029Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:30.8063538Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T13:49:30.8073566Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:30.8095239Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T13:49:30.8104898Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:30.8117174Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:30.8126915Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:30.8141278Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T13:49:30.8155417Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:30.8176854Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T13:49:30.8188992Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:30.8200194Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T13:49:30.8217227Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:30.8235276Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8246197Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:30.8257193Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T13:49:30.8266537Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:30.8278278Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T13:49:30.8287417Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:30.8301215Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:49:30.8310996Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.8326850Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:49:30.8337715Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.8351682Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:49:30.8365309Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:30.8378902Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T13:49:30.8388970Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:30.8406569Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8418878Z Entering 'third_party/kleidiai' 2025-12-04T13:49:30.8430434Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T13:49:30.8440933Z Entering 'third_party/mimalloc' 
2025-12-04T13:49:30.8452094Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T13:49:30.8463143Z Entering 'third_party/nlohmann' 2025-12-04T13:49:30.8475765Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T13:49:30.8492660Z Entering 'third_party/onnx' 2025-12-04T13:49:30.8505055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T13:49:30.8521078Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:30.8531760Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:49:30.8544568Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:30.8558514Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T13:49:30.8568698Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:30.8580122Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:30.8589548Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:30.8600526Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8609933Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:30.8624629Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T13:49:30.8633432Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:30.8644502Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T13:49:30.8653264Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:30.8668025Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T13:49:30.8677132Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:30.8688225Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T13:49:30.8696274Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:30.8706207Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:49:30.8715064Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.8732864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:49:30.8743597Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.8755451Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:49:30.8766720Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:30.8776562Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T13:49:30.8793482Z Entering 'third_party/pocketfft' 2025-12-04T13:49:30.8805311Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T13:49:30.8815602Z Entering 'third_party/protobuf' 2025-12-04T13:49:30.8827485Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T13:49:30.8838295Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:30.8848731Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:30.8858918Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:30.8874293Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8887343Z Entering 'third_party/psimd' 2025-12-04T13:49:30.8901938Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T13:49:30.8912114Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:30.8934199Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T13:49:30.8944465Z Entering 'third_party/pybind11' 2025-12-04T13:49:30.8957878Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:49:30.8968310Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:30.8980474Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T13:49:30.8990096Z Entering 'third_party/sleef' 2025-12-04T13:49:30.9003310Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T13:49:30.9015049Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:30.9027201Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T13:49:30.9038577Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:30.9052714Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.9066278Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:30.9080133Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T13:49:30.9090048Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:30.9101363Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T13:49:30.9110859Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:30.9124149Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:49:30.9133474Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:30.9143830Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T13:49:30.9179587Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9200654Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9219563Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9236899Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9254490Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9270871Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9287742Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9306943Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9323412Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9339893Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9357070Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9373934Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9390476Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9409506Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9425180Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9440995Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9456779Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9472966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9492535Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9506857Z 
[command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9527710Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9545513Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9561106Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9578204Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9593790Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9612736Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9629795Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9645620Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9661812Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9677397Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9693041Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9709399Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9725530Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9741751Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9760313Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9779110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9796044Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9826204Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9834753Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9853404Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9872919Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9897789Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9914533Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9931203Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9947546Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9963913Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9986749Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0003124Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0019990Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0037409Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0054449Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0070128Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0086369Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0103531Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0120306Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0136130Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0151611Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0170141Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0187296Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0205769Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0222536Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0242794Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0259143Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0275893Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0292025Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0309492Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0326043Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0343921Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0361382Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0377952Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0393185Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0409791Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0426825Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0449480Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0465917Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0482552Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0499457Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0519892Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0536242Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0553257Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0571076Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0675972Z Post job cleanup. 
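Editor's note, for readability of the cleanup output that follows: the post-job step repeats a single pattern over the repository and every submodule, namely locating a credential-related git config entry and unsetting it if present. The sketch below is reconstructed from the commands logged further down; the scrub helper name is illustrative only and is not part of the workflow, and the regexps are written in simplified (unescaped-slash) form.

    # Minimal sketch of the credential-scrubbing pattern seen in the cleanup log below.
    # Assumes it is run from the repository root of a checked-out pytorch/pytorch tree.
    scrub() {
      pattern="$1"   # regexp passed to --get-regexp
      name="$2"      # literal key passed to --unset-all
      # top-level repository: only unset the key if it is actually set
      git config --local --name-only --get-regexp "$pattern" \
        && git config --local --unset-all "$name" || :
      # then repeat the same check-and-unset in every submodule, recursively
      git submodule foreach --recursive sh -c \
        "git config --local --name-only --get-regexp '$pattern' && git config --local --unset-all '$name' || :"
    }
    scrub 'core\.sshCommand'                          'core.sshCommand'
    scrub 'http\.https://github\.com/\.extraheader'   'http.https://github.com/.extraheader'

The "Entering '<submodule>'" lines that follow are the per-submodule progress output of these foreach invocations.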
2025-12-04T13:49:31.1142171Z [command]/usr/bin/git version 2025-12-04T13:49:31.1163166Z git version 2.52.0 2025-12-04T13:49:31.1179229Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/55e9c632-7928-4a62-ba87-db6341ba1ccf/.gitconfig' 2025-12-04T13:49:31.1184344Z Temporarily overriding HOME='/home/runner/_work/_temp/55e9c632-7928-4a62-ba87-db6341ba1ccf' before making global git config changes 2025-12-04T13:49:31.1184800Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T13:49:31.1186433Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T13:49:31.1207658Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T13:49:31.1232899Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T13:49:31.1423488Z Entering 'android/libs/fbjni' 2025-12-04T13:49:31.1449470Z Entering 'third_party/FP16' 2025-12-04T13:49:31.1475253Z Entering 'third_party/FXdiv' 2025-12-04T13:49:31.1501911Z Entering 'third_party/NNPACK' 2025-12-04T13:49:31.1535542Z Entering 'third_party/NVTX' 2025-12-04T13:49:31.1568830Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:31.1596457Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:31.1629528Z Entering 'third_party/aiter' 2025-12-04T13:49:31.1655352Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:31.1685138Z Entering 'third_party/benchmark' 2025-12-04T13:49:31.1710051Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:31.1742827Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:31.1775384Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:31.1806552Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:31.1837929Z Entering 'third_party/cutlass' 2025-12-04T13:49:31.1874675Z Entering 'third_party/fbgemm' 2025-12-04T13:49:31.1901768Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:31.1927677Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:31.1955597Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:31.1979710Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:31.2008546Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:31.2046172Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:31.2072932Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:31.2100819Z Entering 'third_party/flash-attention' 2025-12-04T13:49:31.2129542Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:31.2160269Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:31.2193852Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:31.2224651Z Entering 'third_party/fmt' 2025-12-04T13:49:31.2251006Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:31.2275988Z Entering 'third_party/gloo' 2025-12-04T13:49:31.2300043Z Entering 'third_party/googletest' 2025-12-04T13:49:31.2325147Z Entering 'third_party/ideep' 2025-12-04T13:49:31.2356855Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:31.2386245Z Entering 'third_party/ittapi' 2025-12-04T13:49:31.2412907Z Entering 'third_party/kineto' 2025-12-04T13:49:31.2440306Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:31.2465549Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:31.2497319Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:31.2524413Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:31.2554270Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:31.2585701Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:31.2612206Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:31.2643508Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:31.2667385Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:31.2691715Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:31.2718829Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:31.2749523Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.2775491Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.2808427Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:31.2845828Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:31.2878810Z Entering 'third_party/kleidiai' 2025-12-04T13:49:31.2907708Z Entering 'third_party/mimalloc' 2025-12-04T13:49:31.2942861Z Entering 'third_party/nlohmann' 2025-12-04T13:49:31.2978935Z Entering 'third_party/onnx' 2025-12-04T13:49:31.3013159Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:31.3046541Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:31.3076951Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:31.3109396Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:31.3137869Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:31.3167352Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:31.3200898Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:31.3226007Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:31.3259418Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:31.3287254Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.3317646Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.3344334Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:31.3377264Z Entering 'third_party/pocketfft' 2025-12-04T13:49:31.3403897Z Entering 'third_party/protobuf' 2025-12-04T13:49:31.3433334Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:31.3461190Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:31.3493671Z Entering 'third_party/psimd' 2025-12-04T13:49:31.3522583Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:31.3556561Z Entering 'third_party/pybind11' 2025-12-04T13:49:31.3584722Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:31.3615218Z Entering 'third_party/sleef' 2025-12-04T13:49:31.3646148Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:31.3682055Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:31.3717326Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:31.3742920Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:31.3768911Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:31.3795094Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:31.3843885Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T13:49:31.3865002Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T13:49:31.4027175Z Entering 'android/libs/fbjni' 2025-12-04T13:49:31.4052664Z Entering 'third_party/FP16' 2025-12-04T13:49:31.4075790Z Entering 'third_party/FXdiv' 2025-12-04T13:49:31.4100949Z Entering 'third_party/NNPACK' 2025-12-04T13:49:31.4126637Z Entering 'third_party/NVTX' 2025-12-04T13:49:31.4151514Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:31.4177093Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:31.4207892Z Entering 'third_party/aiter' 2025-12-04T13:49:31.4234974Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:31.4265629Z Entering 'third_party/benchmark' 2025-12-04T13:49:31.4290574Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:31.4320258Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:31.4345246Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:31.4369787Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:31.4394788Z Entering 'third_party/cutlass' 2025-12-04T13:49:31.4429434Z Entering 'third_party/fbgemm' 2025-12-04T13:49:31.4455771Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:31.4479186Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:31.4512287Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:31.4536465Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:31.4564435Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:31.4588946Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:31.4615437Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:31.4642430Z Entering 'third_party/flash-attention' 2025-12-04T13:49:31.4668545Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:31.4713335Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:31.4752245Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:31.4787044Z Entering 'third_party/fmt' 2025-12-04T13:49:31.4820924Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:31.4851578Z Entering 'third_party/gloo' 2025-12-04T13:49:31.4884998Z Entering 'third_party/googletest' 2025-12-04T13:49:31.4912030Z Entering 'third_party/ideep' 2025-12-04T13:49:31.4939148Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:31.4970608Z Entering 'third_party/ittapi' 2025-12-04T13:49:31.4996953Z Entering 'third_party/kineto' 2025-12-04T13:49:31.5028861Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:31.5053817Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:31.5083691Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:31.5121342Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:31.5146339Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:31.5172382Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:31.5204106Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:31.5233374Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:31.5263643Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:31.5296019Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:31.5321782Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:31.5347945Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.5373653Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.5408998Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:31.5436782Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:31.5463481Z Entering 'third_party/kleidiai' 2025-12-04T13:49:31.5489034Z Entering 'third_party/mimalloc' 2025-12-04T13:49:31.5514347Z Entering 'third_party/nlohmann' 2025-12-04T13:49:31.5540002Z Entering 'third_party/onnx' 2025-12-04T13:49:31.5571318Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:31.5599735Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:31.5626203Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:31.5650131Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:31.5679328Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:31.5703718Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:31.5733538Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:31.5757035Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:31.5782513Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:31.5811476Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.5841218Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.5871095Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:31.5907536Z Entering 'third_party/pocketfft' 2025-12-04T13:49:31.5944481Z Entering 'third_party/protobuf' 2025-12-04T13:49:31.5972435Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:31.6007202Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:31.6039622Z Entering 'third_party/psimd' 2025-12-04T13:49:31.6066704Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:31.6096359Z Entering 'third_party/pybind11' 2025-12-04T13:49:31.6127340Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:31.6152513Z Entering 'third_party/sleef' 2025-12-04T13:49:31.6179681Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:31.6209191Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:31.6239573Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:31.6265444Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:31.6290492Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2025-12-04T13:49:31.6316452Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:31.6364937Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.6389974Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T13:49:31.6590062Z Entering 'android/libs/fbjni' 2025-12-04T13:49:31.6604842Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T13:49:31.6617542Z Entering 'third_party/FP16' 2025-12-04T13:49:31.6630981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T13:49:31.6642698Z Entering 'third_party/FXdiv' 2025-12-04T13:49:31.6657326Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T13:49:31.6667522Z Entering 'third_party/NNPACK' 2025-12-04T13:49:31.6680155Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T13:49:31.6688503Z Entering 'third_party/NVTX' 2025-12-04T13:49:31.6701606Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T13:49:31.6711425Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:31.6729852Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T13:49:31.6737746Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:31.6751249Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T13:49:31.6767394Z Entering 'third_party/aiter' 2025-12-04T13:49:31.6778537Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T13:49:31.6788395Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:31.6798750Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.6820792Z Entering 'third_party/benchmark' 2025-12-04T13:49:31.6835539Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:31.6845407Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:31.6859055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.6876377Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:31.6888783Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T13:49:31.6899116Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:31.6910488Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T13:49:31.6920386Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:31.6933241Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T13:49:31.6942536Z Entering 'third_party/cutlass' 2025-12-04T13:49:31.6957701Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T13:49:31.6973978Z Entering 'third_party/fbgemm' 2025-12-04T13:49:31.6988963Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T13:49:31.6999636Z Entering 
'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:31.7010588Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T13:49:31.7020338Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:31.7035431Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.7048108Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:31.7060440Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T13:49:31.7069691Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:31.7086414Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T13:49:31.7102553Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:31.7121591Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T13:49:31.7129372Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:31.7141097Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T13:49:31.7149698Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:31.7168919Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T13:49:31.7189278Z Entering 'third_party/flash-attention' 2025-12-04T13:49:31.7205055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T13:49:31.7220286Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:31.7230442Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.7247547Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:31.7258811Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T13:49:31.7279447Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:31.7292751Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T13:49:31.7305661Z Entering 'third_party/fmt' 2025-12-04T13:49:31.7319959Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:31.7330714Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:31.7345977Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T13:49:31.7359231Z Entering 'third_party/gloo' 2025-12-04T13:49:31.7370899Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T13:49:31.7381763Z Entering 'third_party/googletest' 2025-12-04T13:49:31.7395553Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:31.7410417Z Entering 'third_party/ideep' 2025-12-04T13:49:31.7420957Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T13:49:31.7429796Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:31.7442522Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T13:49:31.7456103Z Entering 'third_party/ittapi' 2025-12-04T13:49:31.7468864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T13:49:31.7481603Z Entering 'third_party/kineto' 2025-12-04T13:49:31.7497720Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T13:49:31.7510370Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:31.7525085Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T13:49:31.7534999Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:31.7548072Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T13:49:31.7557915Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:31.7575081Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T13:49:31.7583948Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:31.7599725Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:31.7610366Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:31.7624563Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T13:49:31.7634330Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:31.7648438Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T13:49:31.7659718Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:31.7673045Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T13:49:31.7681426Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:31.7695371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:31.7704951Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:31.7717235Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T13:49:31.7726507Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:31.7742320Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T13:49:31.7753545Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 
2025-12-04T13:49:31.7767025Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url
2025-12-04T13:49:31.7777247Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T13:49:31.7789080Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url
2025-12-04T13:49:31.7805034Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T13:49:31.7820736Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url
2025-12-04T13:49:31.7838664Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2025-12-04T13:49:31.7851579Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url
2025-12-04T13:49:31.7860938Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2025-12-04T13:49:31.7873877Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.7886742Z Entering 'third_party/kleidiai'
2025-12-04T13:49:31.7898525Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url
2025-12-04T13:49:31.7908747Z Entering 'third_party/mimalloc'
2025-12-04T13:49:31.7921209Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url
2025-12-04T13:49:31.7931473Z Entering 'third_party/nlohmann'
2025-12-04T13:49:31.7942979Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url
2025-12-04T13:49:31.7953611Z Entering 'third_party/onnx'
2025-12-04T13:49:31.7965206Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url
2025-12-04T13:49:31.7981617Z Entering 'third_party/onnx/third_party/pybind11'
2025-12-04T13:49:31.7992662Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url
2025-12-04T13:49:31.8004966Z Entering 'third_party/opentelemetry-cpp'
2025-12-04T13:49:31.8016638Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url
2025-12-04T13:49:31.8025944Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2025-12-04T13:49:31.8038243Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url
2025-12-04T13:49:31.8050552Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2025-12-04T13:49:31.8063468Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.8073087Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2025-12-04T13:49:31.8087707Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url
2025-12-04T13:49:31.8101884Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2025-12-04T13:49:31.8116468Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url
2025-12-04T13:49:31.8128901Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2025-12-04T13:49:31.8146890Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url
2025-12-04T13:49:31.8159141Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2025-12-04T13:49:31.8174140Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url
2025-12-04T13:49:31.8184047Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2025-12-04T13:49:31.8195264Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url
2025-12-04T13:49:31.8204886Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T13:49:31.8217997Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url
2025-12-04T13:49:31.8229067Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T13:49:31.8241020Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url
2025-12-04T13:49:31.8252895Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2025-12-04T13:49:31.8265070Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url
2025-12-04T13:49:31.8281126Z Entering 'third_party/pocketfft'
2025-12-04T13:49:31.8292383Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url
2025-12-04T13:49:31.8302245Z Entering 'third_party/protobuf'
2025-12-04T13:49:31.8313052Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url
2025-12-04T13:49:31.8323758Z Entering 'third_party/protobuf/third_party/benchmark'
2025-12-04T13:49:31.8334347Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url
2025-12-04T13:49:31.8347286Z Entering 'third_party/protobuf/third_party/googletest'
2025-12-04T13:49:31.8364260Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.8377818Z Entering 'third_party/psimd'
2025-12-04T13:49:31.8389473Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url
2025-12-04T13:49:31.8399473Z Entering 'third_party/pthreadpool'
2025-12-04T13:49:31.8413248Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url
2025-12-04T13:49:31.8424252Z Entering 'third_party/pybind11'
2025-12-04T13:49:31.8435812Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url
2025-12-04T13:49:31.8445519Z Entering 'third_party/python-peachpy'
2025-12-04T13:49:31.8457400Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url
2025-12-04T13:49:31.8466453Z Entering 'third_party/sleef'
2025-12-04T13:49:31.8477576Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url
2025-12-04T13:49:31.8487327Z Entering 'third_party/tensorpipe'
2025-12-04T13:49:31.8499185Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url
2025-12-04T13:49:31.8508873Z Entering 'third_party/tensorpipe/third_party/googletest'
2025-12-04T13:49:31.8522847Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.8532864Z Entering 'third_party/tensorpipe/third_party/libnop'
2025-12-04T13:49:31.8546673Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url
2025-12-04T13:49:31.8556125Z Entering 'third_party/tensorpipe/third_party/libuv'
2025-12-04T13:49:31.8577372Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url
2025-12-04T13:49:31.8589643Z Entering 'third_party/tensorpipe/third_party/pybind11'
2025-12-04T13:49:31.8605125Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url
2025-12-04T13:49:31.8614717Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2025-12-04T13:49:31.8626075Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url
2025-12-04T13:49:31.8656353Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8675301Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8693071Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8708489Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8724105Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8744796Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8760106Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8782424Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8789112Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8802580Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8819471Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8834542Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8850397Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8869545Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8887162Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8901871Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8916661Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8931146Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8950786Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8965771Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8981843Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8999963Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9014805Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9029085Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9043688Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9063864Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9079554Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9095644Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9111175Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9125690Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9140685Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9155174Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9169624Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9184894Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9203256Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9217923Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9241659Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9259034Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9275262Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9291882Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9307689Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9324485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9343586Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9359110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9374078Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9390102Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9407941Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9424987Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9439898Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9457065Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9471989Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9486584Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9501061Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9515529Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9530016Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9545267Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9560929Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9575845Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9590980Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9615803Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9632391Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9645849Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9669828Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9696348Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9726842Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9745751Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9760013Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9775465Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9789795Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9804645Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9819501Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9833880Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9848344Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9863176Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9877009Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9891743Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9906186Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9920706Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9934021Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9951207Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9965699Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:32.0072842Z Cleaning up orphan processes